Glue
****


Client
======

class Glue.Client

   A low-level client representing AWS Glue

   Defines the public endpoint for the Glue service.

      import boto3

      client = boto3.client('glue')

These are the available methods:

* batch_create_partition

* batch_delete_connection

* batch_delete_partition

* batch_delete_table

* batch_delete_table_version

* batch_get_blueprints

* batch_get_crawlers

* batch_get_custom_entity_types

* batch_get_data_quality_result

* batch_get_dev_endpoints

* batch_get_jobs

* batch_get_partition

* batch_get_table_optimizer

* batch_get_triggers

* batch_get_workflows

* batch_put_data_quality_statistic_annotation

* batch_stop_job_run

* batch_update_partition

* can_paginate

* cancel_data_quality_rule_recommendation_run

* cancel_data_quality_ruleset_evaluation_run

* cancel_ml_task_run

* cancel_statement

* check_schema_version_validity

* close

* create_blueprint

* create_catalog

* create_classifier

* create_column_statistics_task_settings

* create_connection

* create_crawler

* create_custom_entity_type

* create_data_quality_ruleset

* create_database

* create_dev_endpoint

* create_integration

* create_integration_resource_property

* create_integration_table_properties

* create_job

* create_ml_transform

* create_partition

* create_partition_index

* create_registry

* create_schema

* create_script

* create_security_configuration

* create_session

* create_table

* create_table_optimizer

* create_trigger

* create_usage_profile

* create_user_defined_function

* create_workflow

* delete_blueprint

* delete_catalog

* delete_classifier

* delete_column_statistics_for_partition

* delete_column_statistics_for_table

* delete_column_statistics_task_settings

* delete_connection

* delete_crawler

* delete_custom_entity_type

* delete_data_quality_ruleset

* delete_database

* delete_dev_endpoint

* delete_integration

* delete_integration_table_properties

* delete_job

* delete_ml_transform

* delete_partition

* delete_partition_index

* delete_registry

* delete_resource_policy

* delete_schema

* delete_schema_versions

* delete_security_configuration

* delete_session

* delete_table

* delete_table_optimizer

* delete_table_version

* delete_trigger

* delete_usage_profile

* delete_user_defined_function

* delete_workflow

* describe_connection_type

* describe_entity

* describe_inbound_integrations

* describe_integrations

* get_blueprint

* get_blueprint_run

* get_blueprint_runs

* get_catalog

* get_catalog_import_status

* get_catalogs

* get_classifier

* get_classifiers

* get_column_statistics_for_partition

* get_column_statistics_for_table

* get_column_statistics_task_run

* get_column_statistics_task_runs

* get_column_statistics_task_settings

* get_connection

* get_connections

* get_crawler

* get_crawler_metrics

* get_crawlers

* get_custom_entity_type

* get_data_catalog_encryption_settings

* get_data_quality_model

* get_data_quality_model_result

* get_data_quality_result

* get_data_quality_rule_recommendation_run

* get_data_quality_ruleset

* get_data_quality_ruleset_evaluation_run

* get_database

* get_databases

* get_dataflow_graph

* get_dev_endpoint

* get_dev_endpoints

* get_entity_records

* get_integration_resource_property

* get_integration_table_properties

* get_job

* get_job_bookmark

* get_job_run

* get_job_runs

* get_jobs

* get_mapping

* get_ml_task_run

* get_ml_task_runs

* get_ml_transform

* get_ml_transforms

* get_paginator

* get_partition

* get_partition_indexes

* get_partitions

* get_plan

* get_registry

* get_resource_policies

* get_resource_policy

* get_schema

* get_schema_by_definition

* get_schema_version

* get_schema_versions_diff

* get_security_configuration

* get_security_configurations

* get_session

* get_statement

* get_table

* get_table_optimizer

* get_table_version

* get_table_versions

* get_tables

* get_tags

* get_trigger

* get_triggers

* get_unfiltered_partition_metadata

* get_unfiltered_partitions_metadata

* get_unfiltered_table_metadata

* get_usage_profile

* get_user_defined_function

* get_user_defined_functions

* get_waiter

* get_workflow

* get_workflow_run

* get_workflow_run_properties

* get_workflow_runs

* import_catalog_to_glue

* list_blueprints

* list_column_statistics_task_runs

* list_connection_types

* list_crawlers

* list_crawls

* list_custom_entity_types

* list_data_quality_results

* list_data_quality_rule_recommendation_runs

* list_data_quality_ruleset_evaluation_runs

* list_data_quality_rulesets

* list_data_quality_statistic_annotations

* list_data_quality_statistics

* list_dev_endpoints

* list_entities

* list_jobs

* list_ml_transforms

* list_registries

* list_schema_versions

* list_schemas

* list_sessions

* list_statements

* list_table_optimizer_runs

* list_triggers

* list_usage_profiles

* list_workflows

* modify_integration

* put_data_catalog_encryption_settings

* put_data_quality_profile_annotation

* put_resource_policy

* put_schema_version_metadata

* put_workflow_run_properties

* query_schema_version_metadata

* register_schema_version

* remove_schema_version_metadata

* reset_job_bookmark

* resume_workflow_run

* run_statement

* search_tables

* start_blueprint_run

* start_column_statistics_task_run

* start_column_statistics_task_run_schedule

* start_crawler

* start_crawler_schedule

* start_data_quality_rule_recommendation_run

* start_data_quality_ruleset_evaluation_run

* start_export_labels_task_run

* start_import_labels_task_run

* start_job_run

* start_ml_evaluation_task_run

* start_ml_labeling_set_generation_task_run

* start_trigger

* start_workflow_run

* stop_column_statistics_task_run

* stop_column_statistics_task_run_schedule

* stop_crawler

* stop_crawler_schedule

* stop_session

* stop_trigger

* stop_workflow_run

* tag_resource

* test_connection

* untag_resource

* update_blueprint

* update_catalog

* update_classifier

* update_column_statistics_for_partition

* update_column_statistics_for_table

* update_column_statistics_task_settings

* update_connection

* update_crawler

* update_crawler_schedule

* update_data_quality_ruleset

* update_database

* update_dev_endpoint

* update_integration_resource_property

* update_integration_table_properties

* update_job

* update_job_from_source_control

* update_ml_transform

* update_partition

* update_registry

* update_schema

* update_source_control_from_job

* update_table

* update_table_optimizer

* update_trigger

* update_usage_profile

* update_user_defined_function

* update_workflow


Paginators
==========

Paginators are available on a client instance via the "get_paginator"
method. For more detailed instructions and examples on the usage of
paginators, see the paginators user guide.
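
As a minimal sketch of the pattern described above, the helper below
checks "can_paginate", obtains a paginator, and yields the items found
under one response key on every page. The helper name "iter_items" and
its "result_key" argument are illustrative assumptions, not part of the
Glue client.

```python
def iter_items(client, operation_name, result_key, **request):
    """Yield every item stored under ``result_key`` across all pages."""
    # can_paginate() reports whether the operation has a paginator.
    if not client.can_paginate(operation_name):
        raise ValueError(f"{operation_name} has no paginator")
    paginator = client.get_paginator(operation_name)
    for page in paginator.paginate(**request):
        # Each page is the ordinary response dict of the underlying call.
        yield from page.get(result_key, [])
```

With a real client this might be called as
"iter_items(boto3.client('glue'), 'get_jobs', 'Jobs')". Note that
"get_paginator" takes the snake_case operation name, not the CamelCase
paginator name used in the list that follows.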

The available paginators are:

* DescribeEntity

* GetClassifiers

* GetConnections

* GetCrawlerMetrics

* GetCrawlers

* GetDatabases

* GetDevEndpoints

* GetJobRuns

* GetJobs

* GetPartitionIndexes

* GetPartitions

* GetResourcePolicies

* GetSecurityConfigurations

* GetTableVersions

* GetTables

* GetTriggers

* GetUserDefinedFunctions

* GetWorkflowRuns

* ListBlueprints

* ListConnectionTypes

* ListEntities

* ListJobs

* ListRegistries

* ListSchemaVersions

* ListSchemas

* ListTableOptimizerRuns

* ListTriggers

* ListUsageProfiles

* ListWorkflows


GetPartitions
*************

class Glue.Paginator.GetPartitions

      paginator = client.get_paginator('get_partitions')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.get_partitions()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             CatalogId='string',
             DatabaseName='string',
             TableName='string',
             Expression='string',
             Segment={
                 'SegmentNumber': 123,
                 'TotalSegments': 123
             },
             ExcludeColumnSchema=True|False,
             TransactionId='string',
             QueryAsOfTime=datetime(2015, 1, 1),
             PaginationConfig={
                 'MaxItems': 123,
                 'PageSize': 123,
                 'StartingToken': 'string'
             }
         )

      Parameters:
         * **CatalogId** (*string*) -- The ID of the Data Catalog
           where the partitions in question reside. If none is
           provided, the Amazon Web Services account ID is used by
           default.

         * **DatabaseName** (*string*) --

           **[REQUIRED]**

           The name of the catalog database where the partitions
           reside.

         * **TableName** (*string*) --

           **[REQUIRED]**

           The name of the partitions' table.

         * **Expression** (*string*) --

           An expression that filters the partitions to be returned.

           The expression uses SQL syntax similar to the SQL "WHERE"
           filter clause. The SQL statement parser JSQLParser parses
           the expression.

           *Operators*: The following are the operators that you can
           use in the "Expression" API call:

              =

           Checks whether the values of the two operands are equal; if
           yes, then the condition becomes true.

           Example: Assume 'variable a' holds 10 and 'variable b'
           holds 20.

           (a = b) is not true.

              < >

           Checks whether the values of the two operands are not
           equal; if they differ, then the condition becomes true.

           Example: (a < > b) is true.

              >

           Checks whether the value of the left operand is greater
           than the value of the right operand; if yes, then the
           condition becomes true.

           Example: (a > b) is not true.

              <

           Checks whether the value of the left operand is less than
           the value of the right operand; if yes, then the condition
           becomes true.

           Example: (a < b) is true.

              >=

           Checks whether the value of the left operand is greater
           than or equal to the value of the right operand; if yes,
           then the condition becomes true.

           Example: (a >= b) is not true.

              <=

           Checks whether the value of the left operand is less than
           or equal to the value of the right operand; if yes, then
           the condition becomes true.

           Example: (a <= b) is true.

              AND, OR, IN, BETWEEN, LIKE, NOT, IS NULL

           Logical operators.

           *Supported Partition Key Types*: The following are the
           supported partition keys.

           * "string"

           * "date"

           * "timestamp"

           * "int"

           * "bigint"

           * "long"

           * "tinyint"

           * "smallint"

           * "decimal"

           If a type is encountered that is not valid, an exception
           is thrown.

           The following list shows the valid operators on each type.
           When you define a crawler, the "partitionKey" type is
           created as a "STRING", to be compatible with the catalog
           partitions.

           *Sample API Call*:

         * **Segment** (*dict*) --

           The segment of the table's partitions to scan in this
           request.

           * **SegmentNumber** *(integer) --* **[REQUIRED]**

             The zero-based index number of the segment. For example,
             if the total number of segments is 4, "SegmentNumber"
             values range from 0 through 3.

           * **TotalSegments** *(integer) --* **[REQUIRED]**

             The total number of segments.

         * **ExcludeColumnSchema** (*boolean*) -- When true, the
           partition column schema is not returned. Useful when you
           are interested only in other partition attributes such as
           partition values or location. This approach avoids the
           problem of a large response by not returning duplicate
           data.

         * **TransactionId** (*string*) -- The transaction ID at which
           to read the partition contents.

         * **QueryAsOfTime** (*datetime*) -- The time as of when to
           read the partition contents. If not set, the most recent
           transaction commit time will be used. Cannot be specified
           along with "TransactionId".

         * **PaginationConfig** (*dict*) --

           A dictionary that provides parameters to control
           pagination.

           * **MaxItems** *(integer) --*

             The total number of items to return. If the total number
             of items available is more than the value specified in
             "MaxItems", then a "NextToken" will be provided in the
             output that you can use to resume pagination.

           * **PageSize** *(integer) --*

             The size of each page.

           * **StartingToken** *(string) --*

             A token to specify where to start paginating. This is the
             "NextToken" from a previous response.

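The request parameters above can be combined to scan a table in
segments. The sketch below builds one set of "paginate" keyword
arguments per segment, with an optional "Expression" filter and
"PageSize". The function name, the database and table names, and the
partition keys "year" and "day" are hypothetical.

```python
def segmented_requests(database, table, total_segments,
                       expression=None, page_size=None):
    """Build one paginate() keyword dict per segment of a table."""
    if total_segments < 1:
        raise ValueError("total_segments must be at least 1")
    requests_per_segment = []
    for number in range(total_segments):  # SegmentNumber is zero-based
        request = {
            "DatabaseName": database,
            "TableName": table,
            "Segment": {
                "SegmentNumber": number,
                "TotalSegments": total_segments,
            },
        }
        if expression is not None:
            request["Expression"] = expression
        if page_size is not None:
            request["PaginationConfig"] = {"PageSize": page_size}
        requests_per_segment.append(request)
    return requests_per_segment


# Hypothetical names and partition keys, for illustration only.
scan_requests = segmented_requests(
    "sales_db", "orders", total_segments=4,
    expression="year = '2024' AND day BETWEEN 1 AND 7",
    page_size=100,
)
```

Each dict can then be passed as "paginator.paginate(**request)", for
example from separate workers so that the segments are read in
parallel.
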
      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'Partitions': [
                    {
                        'Values': [
                            'string',
                        ],
                        'DatabaseName': 'string',
                        'TableName': 'string',
                        'CreationTime': datetime(2015, 1, 1),
                        'LastAccessTime': datetime(2015, 1, 1),
                        'StorageDescriptor': {
                            'Columns': [
                                {
                                    'Name': 'string',
                                    'Type': 'string',
                                    'Comment': 'string',
                                    'Parameters': {
                                        'string': 'string'
                                    }
                                },
                            ],
                            'Location': 'string',
                            'AdditionalLocations': [
                                'string',
                            ],
                            'InputFormat': 'string',
                            'OutputFormat': 'string',
                            'Compressed': True|False,
                            'NumberOfBuckets': 123,
                            'SerdeInfo': {
                                'Name': 'string',
                                'SerializationLibrary': 'string',
                                'Parameters': {
                                    'string': 'string'
                                }
                            },
                            'BucketColumns': [
                                'string',
                            ],
                            'SortColumns': [
                                {
                                    'Column': 'string',
                                    'SortOrder': 123
                                },
                            ],
                            'Parameters': {
                                'string': 'string'
                            },
                            'SkewedInfo': {
                                'SkewedColumnNames': [
                                    'string',
                                ],
                                'SkewedColumnValues': [
                                    'string',
                                ],
                                'SkewedColumnValueLocationMaps': {
                                    'string': 'string'
                                }
                            },
                            'StoredAsSubDirectories': True|False,
                            'SchemaReference': {
                                'SchemaId': {
                                    'SchemaArn': 'string',
                                    'SchemaName': 'string',
                                    'RegistryName': 'string'
                                },
                                'SchemaVersionId': 'string',
                                'SchemaVersionNumber': 123
                            }
                        },
                        'Parameters': {
                            'string': 'string'
                        },
                        'LastAnalyzedTime': datetime(2015, 1, 1),
                        'CatalogId': 'string'
                    },
                ],

            }

         **Response Structure**

         * *(dict) --*

           * **Partitions** *(list) --*

             A list of requested partitions.

             * *(dict) --*

               Represents a slice of table data.

               * **Values** *(list) --*

                 The values of the partition.

                 * *(string) --*

               * **DatabaseName** *(string) --*

                 The name of the catalog database in which to create
                 the partition.

               * **TableName** *(string) --*

                 The name of the database table in which to create the
                 partition.

               * **CreationTime** *(datetime) --*

                 The time at which the partition was created.

               * **LastAccessTime** *(datetime) --*

                 The last time at which the partition was accessed.

               * **StorageDescriptor** *(dict) --*

                 Provides information about the physical location
                 where the partition is stored.

                 * **Columns** *(list) --*

                   A list of the "Columns" in the table.

                   * *(dict) --*

                     A column in a "Table".

                     * **Name** *(string) --*

                       The name of the "Column".

                     * **Type** *(string) --*

                       The data type of the "Column".

                     * **Comment** *(string) --*

                       A free-form text comment.

                     * **Parameters** *(dict) --*

                       These key-value pairs define properties
                       associated with the column.

                       * *(string) --*

                         * *(string) --*

                 * **Location** *(string) --*

                   The physical location of the table. By default,
                   this takes the form of the warehouse location,
                   followed by the database location in the warehouse,
                   followed by the table name.

                 * **AdditionalLocations** *(list) --*

                   A list of locations that point to the path where a
                   Delta table is located.

                   * *(string) --*

                 * **InputFormat** *(string) --*

                   The input format: "SequenceFileInputFormat"
                   (binary), or "TextInputFormat", or a custom format.

                 * **OutputFormat** *(string) --*

                   The output format: "SequenceFileOutputFormat"
                   (binary), or "IgnoreKeyTextOutputFormat", or a
                   custom format.

                 * **Compressed** *(boolean) --*

                   "True" if the data in the table is compressed, or
                   "False" if not.

                 * **NumberOfBuckets** *(integer) --*

                   Must be specified if the table contains any
                   dimension columns.

                 * **SerdeInfo** *(dict) --*

                   The serialization/deserialization (SerDe)
                   information.

                   * **Name** *(string) --*

                     Name of the SerDe.

                   * **SerializationLibrary** *(string) --*

                     Usually the class that implements the SerDe. An
                     example is
                     "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe".

                   * **Parameters** *(dict) --*

                     These key-value pairs define initialization
                     parameters for the SerDe.

                     * *(string) --*

                       * *(string) --*

                 * **BucketColumns** *(list) --*

                   A list of reducer grouping columns, clustering
                   columns, and bucketing columns in the table.

                   * *(string) --*

                 * **SortColumns** *(list) --*

                   A list specifying the sort order of each bucket in
                   the table.

                   * *(dict) --*

                     Specifies the sort order of a sorted column.

                     * **Column** *(string) --*

                       The name of the column.

                     * **SortOrder** *(integer) --*

                       Indicates that the column is sorted in
                       ascending order ("== 1"), or in descending
                       order ("== 0").

                 * **Parameters** *(dict) --*

                   The user-supplied properties in key-value form.

                   * *(string) --*

                     * *(string) --*

                 * **SkewedInfo** *(dict) --*

                   The information about values that appear frequently
                   in a column (skewed values).

                   * **SkewedColumnNames** *(list) --*

                     A list of names of columns that contain skewed
                     values.

                     * *(string) --*

                   * **SkewedColumnValues** *(list) --*

                     A list of values that appear so frequently as to
                     be considered skewed.

                     * *(string) --*

                   * **SkewedColumnValueLocationMaps** *(dict) --*

                     A mapping of skewed values to the columns that
                     contain them.

                     * *(string) --*

                       * *(string) --*

                 * **StoredAsSubDirectories** *(boolean) --*

                   "True" if the table data is stored in
                   subdirectories, or "False" if not.

                 * **SchemaReference** *(dict) --*

                   An object that references a schema stored in the
                   Glue Schema Registry.

                   When creating a table, you can pass an empty list
                   of columns for the schema, and instead use a schema
                   reference.

                   * **SchemaId** *(dict) --*

                     A structure that contains schema identity fields.
                     Either this or the "SchemaVersionId" has to be
                     provided.

                     * **SchemaArn** *(string) --*

                       The Amazon Resource Name (ARN) of the schema.
                       One of "SchemaArn" or "SchemaName" has to be
                       provided.

                     * **SchemaName** *(string) --*

                       The name of the schema. One of "SchemaArn" or
                       "SchemaName" has to be provided.

                     * **RegistryName** *(string) --*

                       The name of the schema registry that contains
                       the schema.

                   * **SchemaVersionId** *(string) --*

                     The unique ID assigned to a version of the
                     schema. Either this or the "SchemaId" has to be
                     provided.

                   * **SchemaVersionNumber** *(integer) --*

                     The version number of the schema.

               * **Parameters** *(dict) --*

                 These key-value pairs define partition parameters.

                 * *(string) --*

                   * *(string) --*

               * **LastAnalyzedTime** *(datetime) --*

                 The last time at which column statistics were
                 computed for this partition.

               * **CatalogId** *(string) --*

                 The ID of the Data Catalog in which the partition
                 resides.
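
To illustrate how the response structure above might be consumed, the
sketch below walks a sequence of pages and maps each partition's
"Values" to its "StorageDescriptor" location. The function name and the
sample pages (including the bucket name) are made up for the example;
real pages would come from "paginator.paginate(...)".

```python
def partition_locations(pages):
    """Map each partition's values (as a tuple) to its storage location."""
    locations = {}
    for page in pages:
        for partition in page.get("Partitions", []):
            # StorageDescriptor may be missing, so read it defensively.
            descriptor = partition.get("StorageDescriptor", {})
            locations[tuple(partition["Values"])] = descriptor.get("Location")
    return locations


# Hand-written pages shaped like the response syntax above.
sample_pages = [
    {"Partitions": [{
        "Values": ["2024", "1"],
        "StorageDescriptor": {"Location": "s3://example-bucket/2024/1/"},
    }]},
    {"Partitions": [{"Values": ["2024", "2"]}]},
]
locations = partition_locations(sample_pages)
```
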


ListSchemaVersions
******************

class Glue.Paginator.ListSchemaVersions

      paginator = client.get_paginator('list_schema_versions')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.list_schema_versions()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             SchemaId={
                 'SchemaArn': 'string',
                 'SchemaName': 'string',
                 'RegistryName': 'string'
             },
             PaginationConfig={
                 'MaxItems': 123,
                 'PageSize': 123,
                 'StartingToken': 'string'
             }
         )

      Parameters:
         * **SchemaId** (*dict*) --

           **[REQUIRED]**

           This is a wrapper structure to contain schema identity
           fields. The structure contains:

           * SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the
             schema. Either "SchemaArn", or "SchemaName" together with
             "RegistryName", has to be provided.

           * SchemaId$SchemaName: The name of the schema. Either
             "SchemaArn", or "SchemaName" together with
             "RegistryName", has to be provided.

           * **SchemaArn** *(string) --*

             The Amazon Resource Name (ARN) of the schema. One of
             "SchemaArn" or "SchemaName" has to be provided.

           * **SchemaName** *(string) --*

             The name of the schema. One of "SchemaArn" or
             "SchemaName" has to be provided.

           * **RegistryName** *(string) --*

             The name of the schema registry that contains the schema.

         * **PaginationConfig** (*dict*) --

           A dictionary that provides parameters to control
           pagination.

           * **MaxItems** *(integer) --*

             The total number of items to return. If the total number
             of items available is more than the value specified in
             "MaxItems", then a "NextToken" will be provided in the
             output that you can use to resume pagination.

           * **PageSize** *(integer) --*

             The size of each page.

           * **StartingToken** *(string) --*

             A token to specify where to start paginating. This is the
             "NextToken" from a previous response.

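The "SchemaId" rule above (an ARN, or a schema name plus a registry
name) can be checked before the request is sent. The helper below is a
sketch under that reading; its name is hypothetical, and preferring the
ARN when both forms are supplied is a choice of this example rather
than documented service behavior.

```python
def build_schema_id(schema_arn=None, schema_name=None, registry_name=None):
    """Return a SchemaId dict, or raise if neither form is complete."""
    if schema_arn is not None:
        # Assumption of this sketch: the ARN alone identifies the schema.
        return {"SchemaArn": schema_arn}
    if schema_name is not None and registry_name is not None:
        return {"SchemaName": schema_name, "RegistryName": registry_name}
    raise ValueError(
        "Provide SchemaArn, or both SchemaName and RegistryName"
    )
```

The result would be passed as "paginator.paginate(SchemaId=...)".
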
      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'Schemas': [
                    {
                        'SchemaArn': 'string',
                        'SchemaVersionId': 'string',
                        'VersionNumber': 123,
                        'Status': 'AVAILABLE'|'PENDING'|'FAILURE'|'DELETING',
                        'CreatedTime': 'string'
                    },
                ],

            }

         **Response Structure**

         * *(dict) --*

           * **Schemas** *(list) --*

             An array of "SchemaVersionList" objects containing
             details of each schema version.

             * *(dict) --*

               An object containing the details about a schema
               version.

               * **SchemaArn** *(string) --*

                 The Amazon Resource Name (ARN) of the schema.

               * **SchemaVersionId** *(string) --*

                 The unique identifier of the schema version.

               * **VersionNumber** *(integer) --*

                 The version number of the schema.

               * **Status** *(string) --*

                 The status of the schema version.

               * **CreatedTime** *(string) --*

                 The date and time the schema version was created.
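
As an illustration of reading this response, the sketch below scans
pages for the highest "VersionNumber" whose "Status" is "AVAILABLE".
The function name and the sample data are assumptions made for the
example.

```python
def latest_available(pages):
    """Return the AVAILABLE schema version with the highest number, or None."""
    best = None
    for page in pages:
        for version in page.get("Schemas", []):
            if version.get("Status") != "AVAILABLE":
                continue  # skip PENDING, FAILURE and DELETING versions
            if best is None or version["VersionNumber"] > best["VersionNumber"]:
                best = version
    return best


# Hand-written pages shaped like the response syntax above.
sample_pages = [
    {"Schemas": [
        {"VersionNumber": 1, "Status": "AVAILABLE"},
        {"VersionNumber": 3, "Status": "PENDING"},
    ]},
    {"Schemas": [{"VersionNumber": 2, "Status": "AVAILABLE"}]},
]
newest = latest_available(sample_pages)
```
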


GetTableVersions
****************

class Glue.Paginator.GetTableVersions

      paginator = client.get_paginator('get_table_versions')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.get_table_versions()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             CatalogId='string',
             DatabaseName='string',
             TableName='string',
             PaginationConfig={
                 'MaxItems': 123,
                 'PageSize': 123,
                 'StartingToken': 'string'
             }
         )

      Parameters:
         * **CatalogId** (*string*) -- The ID of the Data Catalog
           where the tables reside. If none is provided, the Amazon
           Web Services account ID is used by default.

         * **DatabaseName** (*string*) --

           **[REQUIRED]**

           The database in the catalog in which the table resides. For
           Hive compatibility, this name is entirely lowercase.

         * **TableName** (*string*) --

           **[REQUIRED]**

           The name of the table. For Hive compatibility, this name is
           entirely lowercase.

         * **PaginationConfig** (*dict*) --

           A dictionary that provides parameters to control
           pagination.

           * **MaxItems** *(integer) --*

             The total number of items to return. If the total number
             of items available is more than the value specified in
             "MaxItems", then a "NextToken" will be provided in the
             output that you can use to resume pagination.

           * **PageSize** *(integer) --*

             The size of each page.

           * **StartingToken** *(string) --*

             A token to specify where to start paginating. This is the
             "NextToken" from a previous response.

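"MaxItems" and "StartingToken" can be used together to read table
versions in fixed-size batches. The sketch below assumes the iterator
returned by "paginate" offers botocore's "build_full_result()", which
merges the pages and includes a "NextToken" when the result was
truncated. The function name "fetch_batch" is hypothetical.

```python
def fetch_batch(paginator, request, max_items, starting_token=None):
    """Fetch up to ``max_items`` table versions and the token to resume from."""
    config = {"MaxItems": max_items}
    if starting_token is not None:
        config["StartingToken"] = starting_token
    page_iterator = paginator.paginate(PaginationConfig=config, **request)
    result = page_iterator.build_full_result()
    # NextToken is absent once every item has been returned.
    return result.get("TableVersions", []), result.get("NextToken")
```

A caller would loop, passing the returned token back in as
"starting_token", until the token comes back as "None".
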
      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'TableVersions': [
                    {
                        'Table': {
                            'Name': 'string',
                            'DatabaseName': 'string',
                            'Description': 'string',
                            'Owner': 'string',
                            'CreateTime': datetime(2015, 1, 1),
                            'UpdateTime': datetime(2015, 1, 1),
                            'LastAccessTime': datetime(2015, 1, 1),
                            'LastAnalyzedTime': datetime(2015, 1, 1),
                            'Retention': 123,
                            'StorageDescriptor': {
                                'Columns': [
                                    {
                                        'Name': 'string',
                                        'Type': 'string',
                                        'Comment': 'string',
                                        'Parameters': {
                                            'string': 'string'
                                        }
                                    },
                                ],
                                'Location': 'string',
                                'AdditionalLocations': [
                                    'string',
                                ],
                                'InputFormat': 'string',
                                'OutputFormat': 'string',
                                'Compressed': True|False,
                                'NumberOfBuckets': 123,
                                'SerdeInfo': {
                                    'Name': 'string',
                                    'SerializationLibrary': 'string',
                                    'Parameters': {
                                        'string': 'string'
                                    }
                                },
                                'BucketColumns': [
                                    'string',
                                ],
                                'SortColumns': [
                                    {
                                        'Column': 'string',
                                        'SortOrder': 123
                                    },
                                ],
                                'Parameters': {
                                    'string': 'string'
                                },
                                'SkewedInfo': {
                                    'SkewedColumnNames': [
                                        'string',
                                    ],
                                    'SkewedColumnValues': [
                                        'string',
                                    ],
                                    'SkewedColumnValueLocationMaps': {
                                        'string': 'string'
                                    }
                                },
                                'StoredAsSubDirectories': True|False,
                                'SchemaReference': {
                                    'SchemaId': {
                                        'SchemaArn': 'string',
                                        'SchemaName': 'string',
                                        'RegistryName': 'string'
                                    },
                                    'SchemaVersionId': 'string',
                                    'SchemaVersionNumber': 123
                                }
                            },
                            'PartitionKeys': [
                                {
                                    'Name': 'string',
                                    'Type': 'string',
                                    'Comment': 'string',
                                    'Parameters': {
                                        'string': 'string'
                                    }
                                },
                            ],
                            'ViewOriginalText': 'string',
                            'ViewExpandedText': 'string',
                            'TableType': 'string',
                            'Parameters': {
                                'string': 'string'
                            },
                            'CreatedBy': 'string',
                            'IsRegisteredWithLakeFormation': True|False,
                            'TargetTable': {
                                'CatalogId': 'string',
                                'DatabaseName': 'string',
                                'Name': 'string',
                                'Region': 'string'
                            },
                            'CatalogId': 'string',
                            'VersionId': 'string',
                            'FederatedTable': {
                                'Identifier': 'string',
                                'DatabaseIdentifier': 'string',
                                'ConnectionName': 'string',
                                'ConnectionType': 'string'
                            },
                            'ViewDefinition': {
                                'IsProtected': True|False,
                                'Definer': 'string',
                                'SubObjects': [
                                    'string',
                                ],
                                'Representations': [
                                    {
                                        'Dialect': 'REDSHIFT'|'ATHENA'|'SPARK',
                                        'DialectVersion': 'string',
                                        'ViewOriginalText': 'string',
                                        'ViewExpandedText': 'string',
                                        'ValidationConnection': 'string',
                                        'IsStale': True|False
                                    },
                                ]
                            },
                            'IsMultiDialectView': True|False,
                            'Status': {
                                'RequestedBy': 'string',
                                'UpdatedBy': 'string',
                                'RequestTime': datetime(2015, 1, 1),
                                'UpdateTime': datetime(2015, 1, 1),
                                'Action': 'UPDATE'|'CREATE',
                                'State': 'QUEUED'|'IN_PROGRESS'|'SUCCESS'|'STOPPED'|'FAILED',
                                'Error': {
                                    'ErrorCode': 'string',
                                    'ErrorMessage': 'string'
                                },
                                'Details': {
                                    'RequestedChange': {'... recursive ...'},
                                    'ViewValidations': [
                                        {
                                            'Dialect': 'REDSHIFT'|'ATHENA'|'SPARK',
                                            'DialectVersion': 'string',
                                            'ViewValidationText': 'string',
                                            'UpdateTime': datetime(2015, 1, 1),
                                            'State': 'QUEUED'|'IN_PROGRESS'|'SUCCESS'|'STOPPED'|'FAILED',
                                            'Error': {
                                                'ErrorCode': 'string',
                                                'ErrorMessage': 'string'
                                            }
                                        },
                                    ]
                                }
                            }
                        },
                        'VersionId': 'string'
                    },
                ],

            }
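
A minimal sketch of reading the structure above, assuming "pages" is
any iterable of response dicts shaped like the Response Syntax (for
example, the iterator returned by "paginator.paginate()"). The helper
name "columns_by_version" and the sample data are hypothetical and are
not part of boto3. Optional keys are read with ".get()" because a
response may omit them.

```python
def columns_by_version(pages):
    """Map each VersionId to a list of (column name, column type) pairs."""
    result = {}
    for page in pages:
        for version in page.get('TableVersions', []):
            table = version.get('Table', {})
            # StorageDescriptor and Columns may be absent from a response.
            descriptor = table.get('StorageDescriptor', {})
            result[version.get('VersionId')] = [
                (column.get('Name'), column.get('Type'))
                for column in descriptor.get('Columns', [])
            ]
    return result


# Hand-written sample pages for illustration only (not real service output).
sample_pages = [
    {
        'TableVersions': [
            {
                'VersionId': '0',
                'Table': {
                    'Name': 'events',
                    'StorageDescriptor': {
                        'Columns': [{'Name': 'id', 'Type': 'bigint'}],
                    },
                },
            },
            {
                'VersionId': '1',
                'Table': {
                    'Name': 'events',
                    'StorageDescriptor': {
                        'Columns': [
                            {'Name': 'id', 'Type': 'bigint'},
                            {'Name': 'ts', 'Type': 'timestamp'},
                        ],
                    },
                },
            },
        ],
    },
]

schemas = columns_by_version(sample_pages)
```

Comparing the column lists of two adjacent versions is one way to see
how a table schema changed between them.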

         **Response Structure**

         * *(dict) --*

           * **TableVersions** *(list) --*

             A list of "TableVersion" structures, one for each
             available version of the specified table.

             * *(dict) --*

               Specifies a version of a table.

               * **Table** *(dict) --*

                 The table in question.

                 * **Name** *(string) --*

                   The table name. For Hive compatibility, this must
                   be entirely lowercase.

                 * **DatabaseName** *(string) --*

                   The name of the database where the table metadata
                   resides. For Hive compatibility, this must be all
                   lowercase.

                 * **Description** *(string) --*

                   A description of the table.

                 * **Owner** *(string) --*

                   The owner of the table.

                 * **CreateTime** *(datetime) --*

                   The time when the table definition was created in
                   the Data Catalog.

                 * **UpdateTime** *(datetime) --*

                   The last time that the table was updated.

                 * **LastAccessTime** *(datetime) --*

                   The last time that the table was accessed. This is
                   usually taken from HDFS, and might not be reliable.

                 * **LastAnalyzedTime** *(datetime) --*

                   The last time that column statistics were computed
                   for this table.

                 * **Retention** *(integer) --*

                   The retention time for this table.

                 * **StorageDescriptor** *(dict) --*

                   A storage descriptor containing information about
                   the physical storage of this table.

                   * **Columns** *(list) --*

                     A list of the "Columns" in the table.

                     * *(dict) --*

                       A column in a "Table".

                       * **Name** *(string) --*

                         The name of the "Column".

                       * **Type** *(string) --*

                         The data type of the "Column".

                       * **Comment** *(string) --*

                         A free-form text comment.

                       * **Parameters** *(dict) --*

                         These key-value pairs define properties
                         associated with the column.

                         * *(string) --*

                           * *(string) --*

                   * **Location** *(string) --*

                     The physical location of the table. By default,
                     this takes the form of the warehouse location,
                     followed by the database location in the
                     warehouse, followed by the table name.

                   * **AdditionalLocations** *(list) --*

                     A list of locations that point to the path where
                     a Delta table is located.

                     * *(string) --*

                   * **InputFormat** *(string) --*

                     The input format: "SequenceFileInputFormat"
                     (binary), or "TextInputFormat", or a custom
                     format.

                   * **OutputFormat** *(string) --*

                     The output format: "SequenceFileOutputFormat"
                     (binary), or "IgnoreKeyTextOutputFormat", or a
                     custom format.

                   * **Compressed** *(boolean) --*

                     "True" if the data in the table is compressed, or
                     "False" if not.

                   * **NumberOfBuckets** *(integer) --*

                     Must be specified if the table contains any
                     dimension columns.

                   * **SerdeInfo** *(dict) --*

                     The serialization/deserialization (SerDe)
                     information.

                     * **Name** *(string) --*

                       Name of the SerDe.

                     * **SerializationLibrary** *(string) --*

                       Usually the class that implements the SerDe. An
                       example is
                       "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe".

                     * **Parameters** *(dict) --*

                       These key-value pairs define initialization
                       parameters for the SerDe.

                       * *(string) --*

                         * *(string) --*

                   * **BucketColumns** *(list) --*

                     A list of reducer grouping columns, clustering
                     columns, and bucketing columns in the table.

                     * *(string) --*

                   * **SortColumns** *(list) --*

                     A list specifying the sort order of each bucket
                     in the table.

                     * *(dict) --*

                       Specifies the sort order of a sorted column.

                       * **Column** *(string) --*

                         The name of the column.

                       * **SortOrder** *(integer) --*

                         Indicates that the column is sorted in
                         ascending order ("== 1"), or in descending
                         order ("== 0").

                   * **Parameters** *(dict) --*

                     The user-supplied properties in key-value form.

                     * *(string) --*

                       * *(string) --*

                   * **SkewedInfo** *(dict) --*

                     The information about values that appear
                     frequently in a column (skewed values).

                     * **SkewedColumnNames** *(list) --*

                       A list of names of columns that contain skewed
                       values.

                       * *(string) --*

                     * **SkewedColumnValues** *(list) --*

                       A list of values that appear so frequently as
                       to be considered skewed.

                       * *(string) --*

                     * **SkewedColumnValueLocationMaps** *(dict) --*

                       A mapping of skewed values to the columns that
                       contain them.

                       * *(string) --*

                         * *(string) --*

                   * **StoredAsSubDirectories** *(boolean) --*

                     "True" if the table data is stored in
                     subdirectories, or "False" if not.

                   * **SchemaReference** *(dict) --*

                     An object that references a schema stored in the
                     Glue Schema Registry.

                     When creating a table, you can pass an empty list
                     of columns for the schema, and instead use a
                     schema reference.

                     * **SchemaId** *(dict) --*

                       A structure that contains schema identity
                       fields. Either this or the "SchemaVersionId"
                       has to be provided.

                       * **SchemaArn** *(string) --*

                         The Amazon Resource Name (ARN) of the schema.
                         One of "SchemaArn" or "SchemaName" has to be
                         provided.

                       * **SchemaName** *(string) --*

                         The name of the schema. One of "SchemaArn" or
                         "SchemaName" has to be provided.

                       * **RegistryName** *(string) --*

                         The name of the schema registry that contains
                         the schema.

                     * **SchemaVersionId** *(string) --*

                       The unique ID assigned to a version of the
                       schema. Either this or the "SchemaId" has to be
                       provided.

                     * **SchemaVersionNumber** *(integer) --*

                       The version number of the schema.

                 * **PartitionKeys** *(list) --*

                   A list of columns by which the table is
                   partitioned. Only primitive types are supported as
                   partition keys.

                   When you create a table used by Amazon Athena, and
                   you do not specify any "PartitionKeys", you must at
                   least set the value of "PartitionKeys" to an empty
                   list. For example:

                   ""PartitionKeys": []"

                   * *(dict) --*

                     A column in a "Table".

                     * **Name** *(string) --*

                       The name of the "Column".

                     * **Type** *(string) --*

                       The data type of the "Column".

                     * **Comment** *(string) --*

                       A free-form text comment.

                     * **Parameters** *(dict) --*

                       These key-value pairs define properties
                       associated with the column.

                       * *(string) --*

                         * *(string) --*

                 * **ViewOriginalText** *(string) --*

                   Included for Apache Hive compatibility. Not used in
                   the normal course of Glue operations. If the table
                   is a "VIRTUAL_VIEW", this field holds certain
                   Athena configuration encoded in base64.

                 * **ViewExpandedText** *(string) --*

                   Included for Apache Hive compatibility. Not used in
                   the normal course of Glue operations.

                 * **TableType** *(string) --*

                   The type of this table. Glue will create tables
                   with the "EXTERNAL_TABLE" type. Other services,
                   such as Athena, may create tables with additional
                   table types.

                   Glue related table types:

                   EXTERNAL_TABLE
                      Hive compatible attribute - indicates a non-Hive
                      managed table.

                   GOVERNED
                      Used by Lake Formation. The Glue Data Catalog
                      understands "GOVERNED".

                 * **Parameters** *(dict) --*

                   These key-value pairs define properties associated
                   with the table.

                   * *(string) --*

                     * *(string) --*

                 * **CreatedBy** *(string) --*

                   The person or entity who created the table.

                 * **IsRegisteredWithLakeFormation** *(boolean) --*

                   Indicates whether the table has been registered
                   with Lake Formation.

                 * **TargetTable** *(dict) --*

                   A "TableIdentifier" structure that describes a
                   target table for resource linking.

                   * **CatalogId** *(string) --*

                     The ID of the Data Catalog in which the table
                     resides.

                   * **DatabaseName** *(string) --*

                     The name of the catalog database that contains
                     the target table.

                   * **Name** *(string) --*

                     The name of the target table.

                   * **Region** *(string) --*

                     Region of the target table.

                 * **CatalogId** *(string) --*

                   The ID of the Data Catalog in which the table
                   resides.

                 * **VersionId** *(string) --*

                   The ID of the table version.

                 * **FederatedTable** *(dict) --*

                   A "FederatedTable" structure that references an
                   entity outside the Glue Data Catalog.

                   * **Identifier** *(string) --*

                     A unique identifier for the federated table.

                   * **DatabaseIdentifier** *(string) --*

                     A unique identifier for the federated database.

                   * **ConnectionName** *(string) --*

                     The name of the connection to the external
                     metastore.

                   * **ConnectionType** *(string) --*

                     The type of connection used to access the
                     federated table, specifying the protocol or
                     method for connecting to the external data
                     source.

                 * **ViewDefinition** *(dict) --*

                   A structure that contains all the information that
                   defines the view, including the dialect or dialects
                   for the view, and the query.

                   * **IsProtected** *(boolean) --*

                     You can set this flag as true to instruct the
                     engine not to push user-provided operations into
                     the logical plan of the view during query
                     planning. However, setting this flag does not
                     guarantee that the engine will comply. Refer to
                     the engine's documentation to understand the
                     guarantees provided, if any.

                   * **Definer** *(string) --*

                     The definer of a view in SQL.

                   * **SubObjects** *(list) --*

                     A list of table Amazon Resource Names (ARNs).

                     * *(string) --*

                   * **Representations** *(list) --*

                     A list of representations.

                     * *(dict) --*

                       A structure that contains the dialect of the
                       view, and the query that defines the view.

                       * **Dialect** *(string) --*

                         The dialect of the query engine.

                       * **DialectVersion** *(string) --*

                         The version of the dialect of the query
                         engine. For example, 3.0.0.

                       * **ViewOriginalText** *(string) --*

                         The "SELECT" query provided by the customer
                         during "CREATE VIEW DDL". This SQL is not
                         used during a query on a view (
                         "ViewExpandedText" is used instead).
                         "ViewOriginalText" is used for cases like
                         "SHOW CREATE VIEW" where users want to see
                         the original DDL command that created the
                         view.

                       * **ViewExpandedText** *(string) --*

                         The expanded SQL for the view. This SQL is
                         used by engines while processing a query on a
                         view. Engines may perform operations during
                         view creation to transform "ViewOriginalText"
                         to "ViewExpandedText". For example:

                         * Fully qualified identifiers: "SELECT * from
                           table1" -> "SELECT * from db1.table1"

                       * **ValidationConnection** *(string) --*

                         The name of the connection to be used to
                         validate the specific representation of the
                         view.

                       * **IsStale** *(boolean) --*

                         Dialects marked as stale are no longer valid
                         and must be updated before they can be
                         queried in their respective query engines.

                 * **IsMultiDialectView** *(boolean) --*

                   Specifies whether the view supports the SQL
                   dialects of one or more different query engines and
                   can therefore be read by those engines.

                 * **Status** *(dict) --*

                   A structure containing information about the state
                   of an asynchronous change to a table.

                   * **RequestedBy** *(string) --*

                     The ARN of the user who requested the
                     asynchronous change.

                   * **UpdatedBy** *(string) --*

                     The ARN of the user who last manually altered the
                     asynchronous change (requesting cancellation,
                     etc.).

                   * **RequestTime** *(datetime) --*

                     An ISO 8601 formatted date string indicating the
                     time that the change was initiated.

                   * **UpdateTime** *(datetime) --*

                     An ISO 8601 formatted date string indicating the
                     time that the state was last updated.

                   * **Action** *(string) --*

                     Indicates which action was called on the table,
                     currently only "CREATE" or "UPDATE".

                   * **State** *(string) --*

                     A generic status for the change in progress, such
                     as QUEUED, IN_PROGRESS, SUCCESS, or FAILED.

                   * **Error** *(dict) --*

                     An error that will only appear when the state is
                     "FAILED". This is a parent-level exception
                     message; there may be different "Error"s for
                     each dialect.

                     * **ErrorCode** *(string) --*

                       The code associated with this error.

                     * **ErrorMessage** *(string) --*

                       A message describing the error.

                   * **Details** *(dict) --*

                     A "StatusDetails" object with information about
                     the requested change.

                     * **RequestedChange** *(dict) --*

                       A "Table" object representing the requested
                       changes.

                     * **ViewValidations** *(list) --*

                       A list of "ViewValidation" objects that contain
                       information for an analytical engine to
                       validate a view.

                       * *(dict) --*

                         A structure that contains information for an
                         analytical engine to validate a view, prior
                         to persisting the view metadata. Used in the
                         case of direct "UpdateTable" or "CreateTable"
                         API calls.

                         * **Dialect** *(string) --*

                           The dialect of the query engine.

                         * **DialectVersion** *(string) --*

                           The version of the dialect of the query
                           engine. For example, 3.0.0.

                         * **ViewValidationText** *(string) --*

                           The "SELECT" query that defines the view,
                           as provided by the customer.

                         * **UpdateTime** *(datetime) --*

                           The time of the last update.

                         * **State** *(string) --*

                           The state of the validation.

                         * **Error** *(dict) --*

                           An error associated with the validation.

                           * **ErrorCode** *(string) --*

                             The code associated with this error.

                           * **ErrorMessage** *(string) --*

                             A message describing the error.

               * **VersionId** *(string) --*

                 The ID value that identifies this table version. A
                 "VersionId" is a string representation of an integer.
                 Each version is incremented by 1.
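
Because a "VersionId" is a string representation of an integer,
comparing the raw strings would order "10" before "9". The following
is a minimal sketch, under that stated assumption, of picking the most
recent version numerically and decoding the "SortOrder" values
described above (1 for ascending, 0 for descending). The helper names
and the sample data are hypothetical and are not part of boto3.

```python
def latest_table_version(pages):
    """Return the TableVersion dict with the numerically largest VersionId.

    Returns None when the pages contain no versions.
    """
    versions = [
        version
        for page in pages
        for version in page.get('TableVersions', [])
    ]
    # Compare as integers: as strings, '10' would sort before '9'.
    return max(versions, key=lambda v: int(v['VersionId']), default=None)


def sort_directions(table):
    """Map each sort column name to 'ascending' (1) or 'descending' (0)."""
    labels = {1: 'ascending', 0: 'descending'}
    sort_columns = table.get('StorageDescriptor', {}).get('SortColumns', [])
    return {
        column['Column']: labels.get(column['SortOrder'], 'unknown')
        for column in sort_columns
    }


# Hand-written sample pages for illustration only (not real service output).
sample_pages = [
    {
        'TableVersions': [
            {'VersionId': '9', 'Table': {'Name': 'events'}},
            {
                'VersionId': '10',
                'Table': {
                    'Name': 'events',
                    'StorageDescriptor': {
                        'SortColumns': [
                            {'Column': 'ts', 'SortOrder': 0},
                            {'Column': 'id', 'SortOrder': 1},
                        ],
                    },
                },
            },
        ],
    },
]

latest = latest_table_version(sample_pages)
directions = sort_directions(latest['Table'])
```

Passing the page iterator directly avoids holding more than the
flattened version list in memory at once.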


GetDevEndpoints
***************

class Glue.Paginator.GetDevEndpoints

      paginator = client.get_paginator('get_dev_endpoints')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.get_dev_endpoints()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             PaginationConfig={
                 'MaxItems': 123,
                 'PageSize': 123,
                 'StartingToken': 'string'
             }
         )
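
A minimal sketch of using "PaginationConfig" to cap the number of
items and resume later, assuming "client" is a Glue client created
with "boto3.client('glue')". The helper name
"fetch_dev_endpoint_names" and its default "page_size" are
hypothetical choices for illustration. The sketch relies on
"build_full_result()", which merges all pages and includes a
"NextToken" key when "MaxItems" cut the listing short.

```python
def fetch_dev_endpoint_names(client, max_items, starting_token=None,
                             page_size=25):
    """Return (endpoint names, resume token or None)."""
    paginator = client.get_paginator('get_dev_endpoints')
    config = {'MaxItems': max_items, 'PageSize': page_size}
    if starting_token is not None:
        # Resume from the "NextToken" of a previous, truncated result.
        config['StartingToken'] = starting_token
    result = paginator.paginate(PaginationConfig=config).build_full_result()
    names = [
        endpoint['EndpointName']
        for endpoint in result.get('DevEndpoints', [])
    ]
    return names, result.get('NextToken')


# Usage (requires AWS credentials and a configured region):
#   import boto3
#   client = boto3.client('glue')
#   names, token = fetch_dev_endpoint_names(client, max_items=50)
#   while token:
#       more, token = fetch_dev_endpoint_names(
#           client, max_items=50, starting_token=token)
#       names.extend(more)
```

Taking the client as a parameter keeps the helper easy to exercise
with a stub object in tests.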

      Parameters:
         **PaginationConfig** (*dict*) --

         A dictionary that provides parameters to control pagination.

         * **MaxItems** *(integer) --*

           The total number of items to return. If the total number of
           items available is more than the value specified in
           "MaxItems", then a "NextToken" will be provided in the
           output that you can use to resume pagination.

         * **PageSize** *(integer) --*

           The size of each page.

         * **StartingToken** *(string) --*

           A token to specify where to start paginating. This is the
           "NextToken" from a previous response.

      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'DevEndpoints': [
                    {
                        'EndpointName': 'string',
                        'RoleArn': 'string',
                        'SecurityGroupIds': [
                            'string',
                        ],
                        'SubnetId': 'string',
                        'YarnEndpointAddress': 'string',
                        'PrivateAddress': 'string',
                        'ZeppelinRemoteSparkInterpreterPort': 123,
                        'PublicAddress': 'string',
                        'Status': 'string',
                        'WorkerType': 'Standard'|'G.1X'|'G.2X'|'G.025X'|'G.4X'|'G.8X'|'Z.2X',
                        'GlueVersion': 'string',
                        'NumberOfWorkers': 123,
                        'NumberOfNodes': 123,
                        'AvailabilityZone': 'string',
                        'VpcId': 'string',
                        'ExtraPythonLibsS3Path': 'string',
                        'ExtraJarsS3Path': 'string',
                        'FailureReason': 'string',
                        'LastUpdateStatus': 'string',
                        'CreatedTimestamp': datetime(2015, 1, 1),
                        'LastModifiedTimestamp': datetime(2015, 1, 1),
                        'PublicKey': 'string',
                        'PublicKeys': [
                            'string',
                        ],
                        'SecurityConfiguration': 'string',
                        'Arguments': {
                            'string': 'string'
                        }
                    },
                ],

            }
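
A minimal sketch of consuming pages shaped like the Response Syntax
above, assuming "pages" is the iterator returned by
"paginator.paginate()" or any iterable of similar dicts. The helper
name "endpoint_names_by_status" is hypothetical, and the status
strings in the sample data are placeholders rather than a list of
values the service is guaranteed to return.

```python
from collections import defaultdict


def endpoint_names_by_status(pages):
    """Group DevEndpoint names by their Status value across all pages."""
    grouped = defaultdict(list)
    for page in pages:
        for endpoint in page.get('DevEndpoints', []):
            # Status is optional in a response, so fall back to a label.
            status = endpoint.get('Status', 'UNKNOWN')
            grouped[status].append(endpoint.get('EndpointName'))
    return dict(grouped)


# Hand-written sample pages for illustration only (not real service output).
sample_pages = [
    {
        'DevEndpoints': [
            {'EndpointName': 'dev-a', 'Status': 'READY', 'WorkerType': 'G.1X'},
            {'EndpointName': 'dev-b', 'Status': 'FAILED',
             'FailureReason': 'example'},
        ],
    },
    {
        'DevEndpoints': [
            {'EndpointName': 'dev-c', 'Status': 'READY'},
        ],
    },
]

by_status = endpoint_names_by_status(sample_pages)
```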

         **Response Structure**

         * *(dict) --*

           * **DevEndpoints** *(list) --*

             A list of "DevEndpoint" definitions.

             * *(dict) --*

               A development endpoint where a developer can remotely
               debug extract, transform, and load (ETL) scripts.

               * **EndpointName** *(string) --*

                 The name of the "DevEndpoint".

               * **RoleArn** *(string) --*

                 The Amazon Resource Name (ARN) of the IAM role used
                 in this "DevEndpoint".

               * **SecurityGroupIds** *(list) --*

                 A list of security group identifiers used in this
                 "DevEndpoint".

                 * *(string) --*

               * **SubnetId** *(string) --*

                 The subnet ID for this "DevEndpoint".

               * **YarnEndpointAddress** *(string) --*

                 The YARN endpoint address used by this "DevEndpoint".

               * **PrivateAddress** *(string) --*

                 A private IP address to access the "DevEndpoint"
                 within a VPC if the "DevEndpoint" is created within
                 one. The "PrivateAddress" field is present only when
                 you create the "DevEndpoint" within your VPC.

               * **ZeppelinRemoteSparkInterpreterPort** *(integer) --*

                 The Apache Zeppelin port for the remote Apache Spark
                 interpreter.

               * **PublicAddress** *(string) --*

                 The public IP address used by this "DevEndpoint". The
                 "PublicAddress" field is present only when you create
                 a non-virtual private cloud (VPC) "DevEndpoint".

               * **Status** *(string) --*

                 The current status of this "DevEndpoint".

               * **WorkerType** *(string) --*

                 The type of predefined worker that is allocated to
                 the development endpoint. Accepts a value of
                 Standard, G.1X, or G.2X.

                 * For the "Standard" worker type, each worker
                   provides 4 vCPU, 16 GB of memory, a 50 GB disk,
                   and 2 executors per worker.

                 * For the "G.1X" worker type, each worker maps to 1
                   DPU (4 vCPU, 16 GB of memory, 64 GB disk), and
                   provides 1 executor per worker. We recommend this
                   worker type for memory-intensive jobs.

                 * For the "G.2X" worker type, each worker maps to 2
                   DPU (8 vCPU, 32 GB of memory, 128 GB disk), and
                   provides 1 executor per worker. We recommend this
                   worker type for memory-intensive jobs.

                 Known issue: when a development endpoint is created
                 with the "G.2X" "WorkerType" configuration, the Spark
                 drivers for the development endpoint will run on 4
                 vCPU, 16 GB of memory, and a 64 GB disk.

               * **GlueVersion** *(string) --*

                 Glue version determines the versions of Apache Spark
                 and Python that Glue supports. The Python version
                 indicates the version supported for running your ETL
                 scripts on development endpoints.

                 For more information about the available Glue
                 versions and corresponding Spark and Python versions,
                 see Glue version in the developer guide.

                 Development endpoints that are created without
                 specifying a Glue version default to Glue 0.9.

                 You can specify a version of Python support for
                 development endpoints by using the "Arguments"
                 parameter in the "CreateDevEndpoint" or
                 "UpdateDevEndpoint" APIs. If no arguments are
                 provided, the version defaults to Python 2.

               * **NumberOfWorkers** *(integer) --*

                 The number of workers of a defined "workerType" that
                 are allocated to the development endpoint.

                 The maximum number of workers you can define is 299
                 for "G.1X" and 149 for "G.2X".

               * **NumberOfNodes** *(integer) --*

                 The number of Glue Data Processing Units (DPUs)
                 allocated to this "DevEndpoint".

               * **AvailabilityZone** *(string) --*

                 The Amazon Web Services Availability Zone where this
                 "DevEndpoint" is located.

               * **VpcId** *(string) --*

                 The ID of the virtual private cloud (VPC) used by
                 this "DevEndpoint".

               * **ExtraPythonLibsS3Path** *(string) --*

                 The paths to one or more Python libraries in an
                 Amazon S3 bucket that should be loaded in your
                 "DevEndpoint". Multiple values must be complete paths
                 separated by a comma.

                 Note:

                   You can only use pure Python libraries with a
                   "DevEndpoint". Libraries that rely on C extensions,
                   such as the pandas Python data analysis library,
                   are not currently supported.

               * **ExtraJarsS3Path** *(string) --*

                 The path to one or more Java ".jar" files in an S3
                 bucket that should be loaded in your "DevEndpoint".

                 Note:

                   You can only use pure Java/Scala libraries with a
                   "DevEndpoint".

               * **FailureReason** *(string) --*

                 The reason for a current failure in this
                 "DevEndpoint".

               * **LastUpdateStatus** *(string) --*

                 The status of the last update.

               * **CreatedTimestamp** *(datetime) --*

                 The point in time at which this "DevEndpoint" was
                 created.

               * **LastModifiedTimestamp** *(datetime) --*

                 The point in time at which this "DevEndpoint" was
                 last modified.

               * **PublicKey** *(string) --*

                 The public key to be used by this "DevEndpoint" for
                 authentication. This attribute is provided for
                 backward compatibility; the recommended attribute is
                 "PublicKeys".

               * **PublicKeys** *(list) --*

                 A list of public keys to be used by the
                 "DevEndpoints" for authentication. Using this
                 attribute is preferred over a single public key
                 because the public keys allow you to have a different
                 private key per client.

                 Note:

                   If you previously created an endpoint with a public
                   key, you must remove that key to be able to set a
                   list of public keys. Call the "UpdateDevEndpoint"
                   API operation with the public key content in the
                   "deletePublicKeys" attribute, and the list of new
                   keys in the "addPublicKeys" attribute.

                 * *(string) --*

               * **SecurityConfiguration** *(string) --*

                 The name of the "SecurityConfiguration" structure to
                 be used with this "DevEndpoint".

               * **Arguments** *(dict) --*

                 A map of arguments used to configure the
                 "DevEndpoint".

                 Valid arguments are:

                 * "--enable-glue-datacatalog": ""

                 You can specify a version of Python support for
                 development endpoints by using the "Arguments"
                 parameter in the "CreateDevEndpoint" or
                 "UpdateDevEndpoint" APIs. If no arguments are
                 provided, the version defaults to Python 2.

                 * *(string) --*

                   * *(string) --*
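
The note under "PublicKeys" describes moving an endpoint from a single legacy key to a list of keys through "UpdateDevEndpoint". The following is a minimal sketch of building that request, not a definitive implementation. The helper name "build_key_migration_request", the endpoint name, and the key strings are hypothetical; "EndpointName", "AddPublicKeys", and "DeletePublicKeys" are request parameters of "update_dev_endpoint".

```python
def build_key_migration_request(dev_endpoint, new_public_keys):
    """Build keyword arguments for client.update_dev_endpoint().

    dev_endpoint is one DevEndpoint dict shaped like the structure above;
    new_public_keys is the list of keys that should replace the legacy key.
    """
    request = {
        "EndpointName": dev_endpoint["EndpointName"],
        "AddPublicKeys": list(new_public_keys),
    }
    # Only ask for a deletion when the endpoint still carries a legacy key.
    legacy_key = dev_endpoint.get("PublicKey")
    if legacy_key:
        request["DeletePublicKeys"] = [legacy_key]
    return request


# Hypothetical endpoint data for illustration only.
endpoint = {"EndpointName": "example-endpoint", "PublicKey": "ssh-rsa AAAA-old"}
request = build_key_migration_request(
    endpoint, ["ssh-rsa AAAA-new-1", "ssh-rsa AAAA-new-2"]
)
print(request)
```

With a real Glue client, the result could then be passed as "client.update_dev_endpoint(**request)".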


ListTableOptimizerRuns
**********************

class Glue.Paginator.ListTableOptimizerRuns

      paginator = client.get_paginator('list_table_optimizer_runs')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.list_table_optimizer_runs()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             CatalogId='string',
             DatabaseName='string',
             TableName='string',
             Type='compaction'|'retention'|'orphan_file_deletion',
             PaginationConfig={
                 'MaxItems': 123,
                 'PageSize': 123,
                 'StartingToken': 'string'
             }
         )
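
As an illustration of the request syntax above, the sketch below fills in the four required parameters and walks every page. The helper name "collect_optimizer_runs" and its argument names are assumptions made for this example; "client" is expected to be a Glue client such as "boto3.client('glue')".

```python
def collect_optimizer_runs(client, catalog_id, database_name, table_name,
                           optimizer_type="compaction", max_items=None):
    """Gather every TableOptimizerRuns entry for one table across all pages."""
    paginator = client.get_paginator("list_table_optimizer_runs")
    request = {
        "CatalogId": catalog_id,
        "DatabaseName": database_name,
        "TableName": table_name,
        "Type": optimizer_type,
    }
    # MaxItems caps the total number of runs returned; omit it to fetch all.
    if max_items is not None:
        request["PaginationConfig"] = {"MaxItems": max_items}

    runs = []
    for page in paginator.paginate(**request):
        # Each page is a response dict; the runs sit under TableOptimizerRuns.
        runs.extend(page.get("TableOptimizerRuns", []))
    return runs
```

Taking the client as an argument keeps the helper free of credentials and region handling, so it can also be exercised against a stub object.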

      Parameters:
         * **CatalogId** (*string*) --

           **[REQUIRED]**

           The Catalog ID of the table.

         * **DatabaseName** (*string*) --

           **[REQUIRED]**

           The name of the database in the catalog in which the table
           resides.

         * **TableName** (*string*) --

           **[REQUIRED]**

           The name of the table.

         * **Type** (*string*) --

           **[REQUIRED]**

           The type of table optimizer.

         * **PaginationConfig** (*dict*) --

           A dictionary that provides parameters to control
           pagination.

           * **MaxItems** *(integer) --*

             The total number of items to return. If the total number
             of items available is more than the value specified in
             max-items then a "NextToken" will be provided in the
             output that you can use to resume pagination.

           * **PageSize** *(integer) --*

             The size of each page.

           * **StartingToken** *(string) --*

             A token to specify where to start paginating. This is the
             "NextToken" from a previous response.

      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'CatalogId': 'string',
                'DatabaseName': 'string',
                'TableName': 'string',
                'TableOptimizerRuns': [
                    {
                        'eventType': 'starting'|'completed'|'failed'|'in_progress',
                        'startTimestamp': datetime(2015, 1, 1),
                        'endTimestamp': datetime(2015, 1, 1),
                        'metrics': {
                            'NumberOfBytesCompacted': 'string',
                            'NumberOfFilesCompacted': 'string',
                            'NumberOfDpus': 'string',
                            'JobDurationInHour': 'string'
                        },
                        'error': 'string',
                        'compactionMetrics': {
                            'IcebergMetrics': {
                                'NumberOfBytesCompacted': 123,
                                'NumberOfFilesCompacted': 123,
                                'DpuHours': 123.0,
                                'NumberOfDpus': 123,
                                'JobDurationInHour': 123.0
                            }
                        },
                        'compactionStrategy': 'binpack'|'sort'|'z-order',
                        'retentionMetrics': {
                            'IcebergMetrics': {
                                'NumberOfDataFilesDeleted': 123,
                                'NumberOfManifestFilesDeleted': 123,
                                'NumberOfManifestListsDeleted': 123,
                                'DpuHours': 123.0,
                                'NumberOfDpus': 123,
                                'JobDurationInHour': 123.0
                            }
                        },
                        'orphanFileDeletionMetrics': {
                            'IcebergMetrics': {
                                'NumberOfOrphanFilesDeleted': 123,
                                'DpuHours': 123.0,
                                'NumberOfDpus': 123,
                                'JobDurationInHour': 123.0
                            }
                        }
                    },
                ]
            }

         **Response Structure**

         * *(dict) --*

           * **CatalogId** *(string) --*

             The Catalog ID of the table.

           * **DatabaseName** *(string) --*

             The name of the database in the catalog in which the
             table resides.

           * **TableName** *(string) --*

             The name of the table.

           * **TableOptimizerRuns** *(list) --*

             A list of the optimizer runs associated with a table.

             * *(dict) --*

               Contains details for a table optimizer run.

               * **eventType** *(string) --*

                 An event type representing the status of the table
                 optimizer run.

               * **startTimestamp** *(datetime) --*

                 Represents the epoch timestamp at which the
                 compaction job was started within Lake Formation.

               * **endTimestamp** *(datetime) --*

                 Represents the epoch timestamp at which the
                 compaction job ended.

               * **metrics** *(dict) --*

                 A "RunMetrics" object containing metrics for the
                 optimizer run.

                 This member is deprecated. See the individual metric
                 members for compaction, retention, and orphan file
                 deletion.

                 * **NumberOfBytesCompacted** *(string) --*

                   The number of bytes removed by the compaction job
                   run.

                 * **NumberOfFilesCompacted** *(string) --*

                   The number of files removed by the compaction job
                   run.

                 * **NumberOfDpus** *(string) --*

                   The number of DPUs consumed by the job, rounded up
                   to the nearest whole number.

                 * **JobDurationInHour** *(string) --*

                   The duration of the job in hours.

               * **error** *(string) --*

                 An error that occurred during the optimizer run.

               * **compactionMetrics** *(dict) --*

                 A "CompactionMetrics" object containing metrics for
                 the optimizer run.

                 * **IcebergMetrics** *(dict) --*

                   A structure containing the Iceberg compaction
                   metrics for the optimizer run.

                   * **NumberOfBytesCompacted** *(integer) --*

                     The number of bytes removed by the compaction job
                     run.

                   * **NumberOfFilesCompacted** *(integer) --*

                     The number of files removed by the compaction job
                     run.

                   * **DpuHours** *(float) --*

                     The number of DPU hours consumed by the job.

                   * **NumberOfDpus** *(integer) --*

                     The number of DPUs consumed by the job, rounded
                     up to the nearest whole number.

                   * **JobDurationInHour** *(float) --*

                     The duration of the job in hours.

               * **compactionStrategy** *(string) --*

                 The strategy used for the compaction run. Indicates
                 which algorithm was applied to determine how files
                 were selected and combined during the compaction
                 process. Valid values are:

                 * "binpack": Combines small files into larger files,
                   typically targeting sizes over 100MB, while
                   applying any pending deletes. This is the
                   recommended compaction strategy for most use cases.

                 * "sort": Organizes data based on specified columns
                   which are sorted hierarchically during compaction,
                   improving query performance for filtered
                   operations. This strategy is recommended when your
                   queries frequently filter on specific columns. To
                   use this strategy, you must first define a sort
                   order in your Iceberg table properties using the
                   "sort_order" table property.

                 * "z-order": Optimizes data organization by blending
                   multiple attributes into a single scalar value that
                   can be used for sorting, allowing efficient
                   querying across multiple dimensions. This strategy
                   is recommended when you need to query data across
                   multiple dimensions simultaneously. To use this
                   strategy, you must first define a sort order in
                   your Iceberg table properties using the
                   "sort_order" table property.

               * **retentionMetrics** *(dict) --*

                 A "RetentionMetrics" object containing metrics for
                 the optimizer run.

                 * **IcebergMetrics** *(dict) --*

                   A structure containing the Iceberg retention
                   metrics for the optimizer run.

                   * **NumberOfDataFilesDeleted** *(integer) --*

                     The number of data files deleted by the retention
                     job run.

                   * **NumberOfManifestFilesDeleted** *(integer) --*

                     The number of manifest files deleted by the
                     retention job run.

                   * **NumberOfManifestListsDeleted** *(integer) --*

                     The number of manifest lists deleted by the
                     retention job run.

                   * **DpuHours** *(float) --*

                     The number of DPU hours consumed by the job.

                   * **NumberOfDpus** *(integer) --*

                     The number of DPUs consumed by the job, rounded
                     up to the nearest whole number.

                   * **JobDurationInHour** *(float) --*

                     The duration of the job in hours.

               * **orphanFileDeletionMetrics** *(dict) --*

                 An "OrphanFileDeletionMetrics" object containing
                 metrics for the optimizer run.

                 * **IcebergMetrics** *(dict) --*

                   A structure containing the Iceberg orphan file
                   deletion metrics for the optimizer run.

                   * **NumberOfOrphanFilesDeleted** *(integer) --*

                     The number of orphan files deleted by the orphan
                     file deletion job run.

                   * **DpuHours** *(float) --*

                     The number of DPU hours consumed by the job.

                   * **NumberOfDpus** *(integer) --*

                     The number of DPUs consumed by the job, rounded
                     up to the nearest whole number.

                   * **JobDurationInHour** *(float) --*

                     The duration of the job in hours.
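
Because the "metrics" member is deprecated, a consumer of this response would likely read the per-type "IcebergMetrics" members instead. The sketch below shows one way to total "DpuHours" and count runs by "eventType". The function names and the sample data are hypothetical and exist only for illustration.

```python
METRIC_MEMBERS = (
    "compactionMetrics",
    "retentionMetrics",
    "orphanFileDeletionMetrics",
)


def total_dpu_hours(runs):
    """Sum IcebergMetrics.DpuHours over all per-type metric members."""
    total = 0.0
    for run in runs:
        for member in METRIC_MEMBERS:
            # A member that is absent from a run contributes nothing.
            iceberg = run.get(member, {}).get("IcebergMetrics", {})
            total += iceberg.get("DpuHours", 0.0)
    return total


def runs_by_event_type(runs):
    """Count runs per eventType value."""
    counts = {}
    for run in runs:
        event = run.get("eventType", "unknown")
        counts[event] = counts.get(event, 0) + 1
    return counts


# Hypothetical runs shaped like the TableOptimizerRuns entries above.
sample_runs = [
    {"eventType": "completed",
     "compactionMetrics": {"IcebergMetrics": {"DpuHours": 0.5}}},
    {"eventType": "completed",
     "retentionMetrics": {"IcebergMetrics": {"DpuHours": 0.25}}},
    {"eventType": "failed", "error": "example failure"},
]
print(total_dpu_hours(sample_runs))
print(runs_by_event_type(sample_runs))
```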


GetConnections
**************

class Glue.Paginator.GetConnections

      paginator = client.get_paginator('get_connections')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.get_connections()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             CatalogId='string',
             Filter={
                 'MatchCriteria': [
                     'string',
                 ],
                 'ConnectionType': 'JDBC'|'SFTP'|'MONGODB'|'KAFKA'|'NETWORK'|'MARKETPLACE'|'CUSTOM'|'SALESFORCE'|'VIEW_VALIDATION_REDSHIFT'|'VIEW_VALIDATION_ATHENA'|'GOOGLEADS'|'GOOGLESHEETS'|'GOOGLEANALYTICS4'|'SERVICENOW'|'MARKETO'|'SAPODATA'|'ZENDESK'|'JIRACLOUD'|'NETSUITEERP'|'HUBSPOT'|'FACEBOOKADS'|'INSTAGRAMADS'|'ZOHOCRM'|'SALESFORCEPARDOT'|'SALESFORCEMARKETINGCLOUD'|'SLACK'|'STRIPE'|'INTERCOM'|'SNAPCHATADS',
                 'ConnectionSchemaVersion': 123
             },
             HidePassword=True|False,
             PaginationConfig={
                 'MaxItems': 123,
                 'PageSize': 123,
                 'StartingToken': 'string'
             }
         )
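
As an illustration of the request syntax above, the sketch below lists connection names with "HidePassword" set, optionally narrowing the results with a "ConnectionType" filter. The helper name "list_connection_names" and its argument names are assumptions made for this example; "client" is expected to be a Glue client such as "boto3.client('glue')".

```python
def list_connection_names(client, connection_type=None, catalog_id=None):
    """Return the Name of every connection, optionally filtered by type."""
    # HidePassword=True asks for connection metadata without the password.
    request = {"HidePassword": True}
    if connection_type is not None:
        request["Filter"] = {"ConnectionType": connection_type}
    if catalog_id is not None:
        request["CatalogId"] = catalog_id

    paginator = client.get_paginator("get_connections")
    names = []
    for page in paginator.paginate(**request):
        # Each page carries its connection definitions under ConnectionList.
        names.extend(conn["Name"] for conn in page.get("ConnectionList", []))
    return names
```

A call such as "list_connection_names(client, connection_type='JDBC')" would then return only the JDBC connection names in the default catalog.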

      Parameters:
         * **CatalogId** (*string*) -- The ID of the Data Catalog in
           which the connections reside. If none is provided, the
           Amazon Web Services account ID is used by default.

         * **Filter** (*dict*) --

           A filter that controls which connections are returned.

           * **MatchCriteria** *(list) --*

             A criteria string that must match the criteria recorded
             in the connection definition for that connection
             definition to be returned.

             * *(string) --*

           * **ConnectionType** *(string) --*

             The type of connections to return. Currently, SFTP is not
             supported.

           * **ConnectionSchemaVersion** *(integer) --*

             Denotes if the connection was created with schema version
             1 or 2.

         * **HidePassword** (*boolean*) -- Allows you to retrieve the
           connection metadata without returning the password. For
           instance, the Glue console uses this flag to retrieve the
           connection, and does not display the password. Set this
           parameter when the caller might not have permission to use
           the KMS key to decrypt the password, but it does have
           permission to access the rest of the connection properties.

         * **PaginationConfig** (*dict*) --

           A dictionary that provides parameters to control
           pagination.

           * **MaxItems** *(integer) --*

             The total number of items to return. If the total number
             of items available is more than the value specified in
             max-items then a "NextToken" will be provided in the
             output that you can use to resume pagination.

           * **PageSize** *(integer) --*

             The size of each page.

           * **StartingToken** *(string) --*

             A token to specify where to start paginating. This is the
             "NextToken" from a previous response.

      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'ConnectionList': [
                    {
                        'Name': 'string',
                        'Description': 'string',
                        'ConnectionType': 'JDBC'|'SFTP'|'MONGODB'|'KAFKA'|'NETWORK'|'MARKETPLACE'|'CUSTOM'|'SALESFORCE'|'VIEW_VALIDATION_REDSHIFT'|'VIEW_VALIDATION_ATHENA'|'GOOGLEADS'|'GOOGLESHEETS'|'GOOGLEANALYTICS4'|'SERVICENOW'|'MARKETO'|'SAPODATA'|'ZENDESK'|'JIRACLOUD'|'NETSUITEERP'|'HUBSPOT'|'FACEBOOKADS'|'INSTAGRAMADS'|'ZOHOCRM'|'SALESFORCEPARDOT'|'SALESFORCEMARKETINGCLOUD'|'SLACK'|'STRIPE'|'INTERCOM'|'SNAPCHATADS',
                        'MatchCriteria': [
                            'string',
                        ],
                        'ConnectionProperties': {
                            'string': 'string'
                        },
                        'SparkProperties': {
                            'string': 'string'
                        },
                        'AthenaProperties': {
                            'string': 'string'
                        },
                        'PythonProperties': {
                            'string': 'string'
                        },
                        'PhysicalConnectionRequirements': {
                            'SubnetId': 'string',
                            'SecurityGroupIdList': [
                                'string',
                            ],
                            'AvailabilityZone': 'string'
                        },
                        'CreationTime': datetime(2015, 1, 1),
                        'LastUpdatedTime': datetime(2015, 1, 1),
                        'LastUpdatedBy': 'string',
                        'Status': 'READY'|'IN_PROGRESS'|'FAILED',
                        'StatusReason': 'string',
                        'LastConnectionValidationTime': datetime(2015, 1, 1),
                        'AuthenticationConfiguration': {
                            'AuthenticationType': 'BASIC'|'OAUTH2'|'CUSTOM'|'IAM',
                            'SecretArn': 'string',
                            'OAuth2Properties': {
                                'OAuth2GrantType': 'AUTHORIZATION_CODE'|'CLIENT_CREDENTIALS'|'JWT_BEARER',
                                'OAuth2ClientApplication': {
                                    'UserManagedClientApplicationClientId': 'string',
                                    'AWSManagedClientApplicationReference': 'string'
                                },
                                'TokenUrl': 'string',
                                'TokenUrlParametersMap': {
                                    'string': 'string'
                                }
                            }
                        },
                        'ConnectionSchemaVersion': 123,
                        'CompatibleComputeEnvironments': [
                            'SPARK'|'ATHENA'|'PYTHON',
                        ]
                    },
                ],
            }

         **Response Structure**

         * *(dict) --*

           * **ConnectionList** *(list) --*

             A list of requested connection definitions.

             * *(dict) --*

               Defines a connection to a data source.

               * **Name** *(string) --*

                 The name of the connection definition.

               * **Description** *(string) --*

                 The description of the connection.

               * **ConnectionType** *(string) --*

                 The type of the connection. Currently, SFTP is not
                 supported.

               * **MatchCriteria** *(list) --*

                 A list of criteria that can be used in selecting this
                 connection.

                 * *(string) --*

               * **ConnectionProperties** *(dict) --*

                 These key-value pairs define parameters for the
                 connection when using the version 1 Connection
                 schema:

                 * "HOST" - The host URI: either the fully qualified
                   domain name (FQDN) or the IPv4 address of the
                   database host.

                 * "PORT" - The port number, between 1024 and 65535,
                   of the port on which the database host is listening
                   for database connections.

                 * "USER_NAME" - The name under which to log in to the
                   database. The value string for "USER_NAME" is
                   "USERNAME".

                 * "PASSWORD" - A password, if one is used, for the
                   user name.

                 * "ENCRYPTED_PASSWORD" - When you enable connection
                   password protection by setting
                   "ConnectionPasswordEncryption" in the Data Catalog
                   encryption settings, this field stores the
                   encrypted password.

                 * "JDBC_DRIVER_JAR_URI" - The Amazon Simple Storage
                   Service (Amazon S3) path of the JAR file that
                   contains the JDBC driver to use.

                 * "JDBC_DRIVER_CLASS_NAME" - The class name of the
                   JDBC driver to use.

                 * "JDBC_ENGINE" - The name of the JDBC engine to use.

                 * "JDBC_ENGINE_VERSION" - The version of the JDBC
                   engine to use.

                 * "CONFIG_FILES" - (Reserved for future use.)

                 * "INSTANCE_ID" - The instance ID to use.

                 * "JDBC_CONNECTION_URL" - The URL for connecting to a
                   JDBC data source.

                 * "JDBC_ENFORCE_SSL" - A Boolean string (true, false)
                   specifying whether Secure Sockets Layer (SSL) with
                   hostname matching is enforced for the JDBC
                   connection on the client. The default is false.

                 * "CUSTOM_JDBC_CERT" - An Amazon S3 location
                   specifying the customer's root certificate. Glue
                   uses this root certificate to validate the
                   customer’s certificate when connecting to the
                   customer database. Glue only handles X.509
                   certificates. The certificate provided must be DER-
                   encoded and supplied in Base64 encoding PEM format.

                 * "SKIP_CUSTOM_JDBC_CERT_VALIDATION" - By default,
                   this is "false". Glue validates the Signature
                   algorithm and Subject Public Key Algorithm for the
                   customer certificate. The only permitted algorithms
                   for the Signature algorithm are SHA256withRSA,
                   SHA384withRSA or SHA512withRSA. For the Subject
                   Public Key Algorithm, the key length must be at
                   least 2048. You can set the value of this property
                   to "true" to skip Glue’s validation of the customer
                   certificate.

                 * "CUSTOM_JDBC_CERT_STRING" - A custom JDBC
                   certificate string which is used for domain match
                   or distinguished name match to prevent a man-in-
                   the-middle attack. In Oracle database, this is used
                   as the "SSL_SERVER_CERT_DN"; in Microsoft SQL
                   Server, this is used as the
                   "hostNameInCertificate".

                 * "CONNECTION_URL" - The URL for connecting to a
                   general (non-JDBC) data source.

                 * "SECRET_ID" - The secret ID used for the secret
                   manager of credentials.

                 * "CONNECTOR_URL" - The connector URL for a
                   MARKETPLACE or CUSTOM connection.

                 * "CONNECTOR_TYPE" - The connector type for a
                   MARKETPLACE or CUSTOM connection.

                 * "CONNECTOR_CLASS_NAME" - The connector class name
                   for a MARKETPLACE or CUSTOM connection.

                 * "KAFKA_BOOTSTRAP_SERVERS" - A comma-separated list
                   of host and port pairs that are the addresses of
                   the Apache Kafka brokers in a Kafka cluster to
                   which a Kafka client will connect and bootstrap
                   itself.

                 * "KAFKA_SSL_ENABLED" - Whether to enable or disable
                   SSL on an Apache Kafka connection. Default value is
                   "true".

                 * "KAFKA_CUSTOM_CERT" - The Amazon S3 URL for the
                   private CA cert file (.pem format). The default is
                   an empty string.

                 * "KAFKA_SKIP_CUSTOM_CERT_VALIDATION" - Whether to
                   skip the validation of the CA cert file or not.
                   Glue validates for three algorithms: SHA256withRSA,
                   SHA384withRSA and SHA512withRSA. Default value is
                   "false".

                 * "KAFKA_CLIENT_KEYSTORE" - The Amazon S3 location of
                   the client keystore file for Kafka client side
                   authentication (Optional).

                 * "KAFKA_CLIENT_KEYSTORE_PASSWORD" - The password to
                   access the provided keystore (Optional).

                 * "KAFKA_CLIENT_KEY_PASSWORD" - A keystore can
                   consist of multiple keys, so this is the password
                   to access the client key to be used with the Kafka
                   server side key (Optional).

                 * "ENCRYPTED_KAFKA_CLIENT_KEYSTORE_PASSWORD" - The
                   encrypted version of the Kafka client keystore
                   password (if the user has the Glue encrypt
                   passwords setting selected).

                 * "ENCRYPTED_KAFKA_CLIENT_KEY_PASSWORD" - The
                   encrypted version of the Kafka client key password
                   (if the user has the Glue encrypt passwords setting
                   selected).

                 * "KAFKA_SASL_MECHANISM" - "SCRAM-SHA-512",
                   "GSSAPI", "AWS_MSK_IAM", or "PLAIN". These are the
                   supported SASL mechanisms.

                 * "KAFKA_SASL_PLAIN_USERNAME" - A plaintext username
                   used to authenticate with the "PLAIN" mechanism.

                 * "KAFKA_SASL_PLAIN_PASSWORD" - A plaintext password
                   used to authenticate with the "PLAIN" mechanism.

                 * "ENCRYPTED_KAFKA_SASL_PLAIN_PASSWORD" - The
                   encrypted version of the Kafka SASL PLAIN password
                   (if the user has the Glue encrypt passwords setting
                   selected).

                 * "KAFKA_SASL_SCRAM_USERNAME" - A plaintext username
                   used to authenticate with the "SCRAM-SHA-512"
                   mechanism.

                 * "KAFKA_SASL_SCRAM_PASSWORD" - A plaintext password
                   used to authenticate with the "SCRAM-SHA-512"
                   mechanism.

                 * "ENCRYPTED_KAFKA_SASL_SCRAM_PASSWORD" - The
                   encrypted version of the Kafka SASL SCRAM password
                   (if the user has the Glue encrypt passwords setting
                   selected).

                 * "KAFKA_SASL_SCRAM_SECRETS_ARN" - The Amazon
                   Resource Name of a secret in Amazon Web Services
                   Secrets Manager.

                 * "KAFKA_SASL_GSSAPI_KEYTAB" - The S3 location of a
                   Kerberos "keytab" file. A keytab stores long-term
                   keys for one or more principals. For more
                   information, see MIT Kerberos Documentation:
                   Keytab.

                 * "KAFKA_SASL_GSSAPI_KRB5_CONF" - The S3 location of
                   a Kerberos "krb5.conf" file. A krb5.conf stores
                   Kerberos configuration information, such as the
                   location of the KDC server. For more information,
                   see MIT Kerberos Documentation: krb5.conf.

                 * "KAFKA_SASL_GSSAPI_SERVICE" - The Kerberos service
                   name, as set with "sasl.kerberos.service.name" in
                   your Kafka Configuration.

                 * "KAFKA_SASL_GSSAPI_PRINCIPAL" - The name of the
                   Kerberos principal used by Glue. For more
                   information, see Kafka Documentation: Configuring
                   Kafka Brokers.

                 * "ROLE_ARN" - The role to be used for running
                   queries.

                 * "REGION" - The Amazon Web Services Region where
                   queries will be run.

                 * "WORKGROUP_NAME" - The name of an Amazon Redshift
                   serverless workgroup or Amazon Athena workgroup in
                   which queries will run.

                 * "CLUSTER_IDENTIFIER" - The cluster identifier of an
                   Amazon Redshift cluster in which queries will run.

                 * "DATABASE" - The Amazon Redshift database that you
                   are connecting to.

                 * *(string) --*

                   * *(string) --*

               * **SparkProperties** *(dict) --*

                 Connection properties specific to the Spark compute
                 environment.

                 * *(string) --*

                   * *(string) --*

               * **AthenaProperties** *(dict) --*

                 Connection properties specific to the Athena compute
                 environment.

                 * *(string) --*

                   * *(string) --*

               * **PythonProperties** *(dict) --*

                 Connection properties specific to the Python compute
                 environment.

                 * *(string) --*

                   * *(string) --*

               * **PhysicalConnectionRequirements** *(dict) --*

                 The physical connection requirements, such as virtual
                 private cloud (VPC) and "SecurityGroup", that are
                 needed to make this connection successfully.

                 * **SubnetId** *(string) --*

                   The subnet ID used by the connection.

                 * **SecurityGroupIdList** *(list) --*

                   The security group ID list used by the connection.

                   * *(string) --*

                 * **AvailabilityZone** *(string) --*

                   The connection's Availability Zone.

               * **CreationTime** *(datetime) --*

                 The timestamp of the time that this connection
                 definition was created.

               * **LastUpdatedTime** *(datetime) --*

                 The timestamp of the last time the connection
                 definition was updated.

               * **LastUpdatedBy** *(string) --*

                 The user, group, or role that last updated this
                 connection definition.

               * **Status** *(string) --*

                 The status of the connection. Can be one of: "READY",
                 "IN_PROGRESS", or "FAILED".

               * **StatusReason** *(string) --*

                 The reason for the connection status.

               * **LastConnectionValidationTime** *(datetime) --*

                 A timestamp of the time this connection was last
                 validated.

               * **AuthenticationConfiguration** *(dict) --*

                 The authentication properties of the connection.

                 * **AuthenticationType** *(string) --*

                   A structure containing the authentication
                   configuration.

                 * **SecretArn** *(string) --*

                   The ARN of the Secrets Manager secret that stores
                   the credentials.

                 * **OAuth2Properties** *(dict) --*

                   The properties for OAuth2 authentication.

                   * **OAuth2GrantType** *(string) --*

                     The OAuth2 grant type. For example,
                     "AUTHORIZATION_CODE", "JWT_BEARER", or
                     "CLIENT_CREDENTIALS".

                   * **OAuth2ClientApplication** *(dict) --*

                     The client application type. For example,
                     AWS_MANAGED or USER_MANAGED.

                     * **UserManagedClientApplicationClientId**
                       *(string) --*

                       The client application clientID if the
                       ClientAppType is "USER_MANAGED".

                     * **AWSManagedClientApplicationReference**
                       *(string) --*

                       The reference to the SaaS-side client app that
                       is Amazon Web Services managed.

                   * **TokenUrl** *(string) --*

                     The URL of the provider's authentication server,
                     to exchange an authorization code for an access
                     token.

                   * **TokenUrlParametersMap** *(dict) --*

                     A map of parameters that are added to the token
                     "GET" request.

                     * *(string) --*

                       * *(string) --*

               * **ConnectionSchemaVersion** *(integer) --*

                 The version of the connection schema for this
                 connection. Version 2 supports properties for
                 specific compute environments.

               * **CompatibleComputeEnvironments** *(list) --*

                 A list of compute environments compatible with the
                 connection.

                 * *(string) --*
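
The connection fields above are nested fairly deeply, so a small helper can make them easier to inspect. The following is a minimal sketch, assuming `connection` is a single connection dict shaped like the structure above. The function name `summarize_connection` and all sample values are hypothetical; only the key names are taken from the field list.

```python
from typing import Any, Dict


def summarize_connection(connection: Dict[str, Any]) -> Dict[str, Any]:
    """Collect status, network, and authentication fields from one connection dict."""
    auth = connection.get("AuthenticationConfiguration", {})
    oauth2 = auth.get("OAuth2Properties", {})
    network = connection.get("PhysicalConnectionRequirements", {})
    return {
        "status": connection.get("Status"),  # "READY", "IN_PROGRESS", or "FAILED"
        "status_reason": connection.get("StatusReason"),
        "auth_type": auth.get("AuthenticationType"),
        "oauth2_grant_type": oauth2.get("OAuth2GrantType"),
        "subnet_id": network.get("SubnetId"),
        "security_group_ids": list(network.get("SecurityGroupIdList", [])),
        "schema_version": connection.get("ConnectionSchemaVersion"),
        "compute_environments": list(
            connection.get("CompatibleComputeEnvironments", [])
        ),
    }


# Hypothetical sample values; only the key names come from the structure above.
sample_connection = {
    "Status": "FAILED",
    "StatusReason": "example reason",
    "PhysicalConnectionRequirements": {
        "SubnetId": "subnet-example",
        "SecurityGroupIdList": ["sg-example"],
    },
    "AuthenticationConfiguration": {
        "AuthenticationType": "OAUTH2",
        "OAuth2Properties": {"OAuth2GrantType": "CLIENT_CREDENTIALS"},
    },
    "ConnectionSchemaVersion": 2,
    "CompatibleComputeEnvironments": ["SPARK", "ATHENA"],
}

summary = summarize_connection(sample_connection)
print(summary["status"], summary["auth_type"], summary["compute_environments"])
```

Every lookup uses `.get()` with a default because any of these keys may be absent from a given response.
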


GetJobRuns
**********

class Glue.Paginator.GetJobRuns

      paginator = client.get_paginator('get_job_runs')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.get_job_runs()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             JobName='string',
             PaginationConfig={
                 'MaxItems': 123,
                 'PageSize': 123,
                 'StartingToken': 'string'
             }
         )

      Parameters:
         * **JobName** (*string*) --

           **[REQUIRED]**

           The name of the job definition for which to retrieve all
           job runs.

         * **PaginationConfig** (*dict*) --

           A dictionary that provides parameters to control
           pagination.

           * **MaxItems** *(integer) --*

             The total number of items to return. If the total number
             of items available is more than the value specified in
             max-items then a "NextToken" will be provided in the
             output that you can use to resume pagination.

           * **PageSize** *(integer) --*

             The size of each page.

           * **StartingToken** *(string) --*

             A token to specify where to start paginating. This is the
             "NextToken" from a previous response.
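
The interplay between "MaxItems" and "StartingToken" described above can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the helper `fetch_job_runs_batch` is hypothetical, it relies on the page iterator's `build_full_result()` method to merge pages and surface "NextToken", and `_OfflinePaginator` is a made-up stand-in (with fake integer tokens) so the sketch runs without AWS access. Real tokens are opaque strings.

```python
from typing import Any, Dict, List, Optional, Tuple


def fetch_job_runs_batch(
    paginator: Any,
    job_name: str,
    max_items: int,
    starting_token: Optional[str] = None,
) -> Tuple[List[Dict[str, Any]], Optional[str]]:
    """Fetch at most max_items job runs and return them with the resume token."""
    config: Dict[str, Any] = {"MaxItems": max_items}
    if starting_token is not None:
        config["StartingToken"] = starting_token
    result = paginator.paginate(
        JobName=job_name, PaginationConfig=config
    ).build_full_result()
    # "NextToken" is only present when more items remain.
    return result.get("JobRuns", []), result.get("NextToken")


class _OfflinePages:
    """Stand-in for the page iterator; hypothetical, for offline use only."""

    def __init__(self, result: Dict[str, Any]) -> None:
        self._result = result

    def build_full_result(self) -> Dict[str, Any]:
        return self._result


class _OfflinePaginator:
    """Stand-in for the real paginator; tokens here are fake list offsets."""

    def __init__(self, runs: List[Dict[str, Any]]) -> None:
        self._runs = runs

    def paginate(self, JobName: str, PaginationConfig: Dict[str, Any]) -> _OfflinePages:
        start = int(PaginationConfig.get("StartingToken", 0))
        end = start + PaginationConfig["MaxItems"]
        result: Dict[str, Any] = {"JobRuns": self._runs[start:end]}
        if end < len(self._runs):
            result["NextToken"] = str(end)
        return _OfflinePages(result)


# Five hypothetical job runs, fetched two at a time.
demo_paginator = _OfflinePaginator([{"Id": f"jr_{i}"} for i in range(5)])
first, token = fetch_job_runs_batch(demo_paginator, "example-job", 2)
second, token = fetch_job_runs_batch(demo_paginator, "example-job", 2, token)
third, last_token = fetch_job_runs_batch(demo_paginator, "example-job", 2, token)
print([run["Id"] for run in first + second + third], last_token)
```

With a real client, the stand-in would be replaced by `client.get_paginator('get_job_runs')`, and the returned token could be stored between invocations to resume later.
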

      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'JobRuns': [
                    {
                        'Id': 'string',
                        'Attempt': 123,
                        'PreviousRunId': 'string',
                        'TriggerName': 'string',
                        'JobName': 'string',
                        'JobMode': 'SCRIPT'|'VISUAL'|'NOTEBOOK',
                        'JobRunQueuingEnabled': True|False,
                        'StartedOn': datetime(2015, 1, 1),
                        'LastModifiedOn': datetime(2015, 1, 1),
                        'CompletedOn': datetime(2015, 1, 1),
                        'JobRunState': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT'|'ERROR'|'WAITING'|'EXPIRED',
                        'Arguments': {
                            'string': 'string'
                        },
                        'ErrorMessage': 'string',
                        'PredecessorRuns': [
                            {
                                'JobName': 'string',
                                'RunId': 'string'
                            },
                        ],
                        'AllocatedCapacity': 123,
                        'ExecutionTime': 123,
                        'Timeout': 123,
                        'MaxCapacity': 123.0,
                        'WorkerType': 'Standard'|'G.1X'|'G.2X'|'G.025X'|'G.4X'|'G.8X'|'Z.2X',
                        'NumberOfWorkers': 123,
                        'SecurityConfiguration': 'string',
                        'LogGroupName': 'string',
                        'NotificationProperty': {
                            'NotifyDelayAfter': 123
                        },
                        'GlueVersion': 'string',
                        'DPUSeconds': 123.0,
                        'ExecutionClass': 'FLEX'|'STANDARD',
                        'MaintenanceWindow': 'string',
                        'ProfileName': 'string',
                        'StateDetail': 'string',
                        'ExecutionRoleSessionPolicy': 'string'
                    },
                ],

            }

         **Response Structure**

         * *(dict) --*

           * **JobRuns** *(list) --*

             A list of job-run metadata objects.

             * *(dict) --*

               Contains information about a job run.

               * **Id** *(string) --*

                 The ID of this job run.

               * **Attempt** *(integer) --*

                 The number of the attempt to run this job.

               * **PreviousRunId** *(string) --*

                 The ID of the previous run of this job. For example,
                 the "JobRunId" specified in the "StartJobRun" action.

               * **TriggerName** *(string) --*

                 The name of the trigger that started this job run.

               * **JobName** *(string) --*

                 The name of the job definition being used in this
                 run.

               * **JobMode** *(string) --*

                 A mode that describes how a job was created. Valid
                 values are:

                 * "SCRIPT" - The job was created using the Glue
                   Studio script editor.

                 * "VISUAL" - The job was created using the Glue
                   Studio visual editor.

                 * "NOTEBOOK" - The job was created using an
                   interactive sessions notebook.

                 When the "JobMode" field is missing or null, "SCRIPT"
                 is assigned as the default value.

               * **JobRunQueuingEnabled** *(boolean) --*

                 Specifies whether job run queuing is enabled for the
                 job run.

                 A value of true means job run queuing is enabled for
                 the job run. If false or not populated, the job run
                 will not be considered for queuing.

               * **StartedOn** *(datetime) --*

                 The date and time at which this job run was started.

               * **LastModifiedOn** *(datetime) --*

                 The last time that this job run was modified.

               * **CompletedOn** *(datetime) --*

                 The date and time that this job run completed.

               * **JobRunState** *(string) --*

                 The current state of the job run. For more
                 information about the statuses of jobs that have
                 terminated abnormally, see Glue Job Run Statuses.

               * **Arguments** *(dict) --*

                 The job arguments associated with this run. For this
                 job run, they replace the default arguments set in
                 the job definition itself.

                 You can specify arguments here that your own job-
                 execution script consumes, as well as arguments that
                 Glue itself consumes.

                 Job arguments may be logged. Do not pass plaintext
                 secrets as arguments. Retrieve secrets from a Glue
                 Connection, Secrets Manager or other secret
                 management mechanism if you intend to keep them
                 within the Job.

                 For information about how to specify and consume your
                 own Job arguments, see the Calling Glue APIs in
                 Python topic in the developer guide.

                 For information about the arguments you can provide
                 to this field when configuring Spark jobs, see the
                 Special Parameters Used by Glue topic in the
                 developer guide.

                 For information about the arguments you can provide
                 to this field when configuring Ray jobs, see Using
                 job parameters in Ray jobs in the developer guide.

                 * *(string) --*

                   * *(string) --*

               * **ErrorMessage** *(string) --*

                 An error message associated with this job run.

               * **PredecessorRuns** *(list) --*

                 A list of predecessors to this job run.

                 * *(dict) --*

                   A job run that was used in the predicate of a
                   conditional trigger that triggered this job run.

                   * **JobName** *(string) --*

                     The name of the job definition used by the
                     predecessor job run.

                   * **RunId** *(string) --*

                     The job-run ID of the predecessor job run.

               * **AllocatedCapacity** *(integer) --*

                 This field is deprecated. Use "MaxCapacity" instead.

                 The number of Glue data processing units (DPUs)
                 allocated to this JobRun. From 2 to 100 DPUs can be
                 allocated; the default is 10. A DPU is a relative
                 measure of processing power that consists of 4 vCPUs
                 of compute capacity and 16 GB of memory. For more
                 information, see the Glue pricing page.

               * **ExecutionTime** *(integer) --*

                 The amount of time (in seconds) that the job run
                 consumed resources.

               * **Timeout** *(integer) --*

                 The "JobRun" timeout in minutes. This is the maximum
                 time that a job run can consume resources before it
                 is terminated and enters "TIMEOUT" status. This value
                 overrides the timeout value set in the parent job.

                 Jobs must have timeout values less than 7 days or
                 10080 minutes. Otherwise, the jobs will throw an
                 exception.

                 When the value is left blank, the timeout is
                 defaulted to 2880 minutes.

                 Any existing Glue jobs that had a timeout value
                 greater than 7 days will be defaulted to 7 days. For
                 instance if you have specified a timeout of 20 days
                 for a batch job, it will be stopped on the 7th day.

                 For streaming jobs, if you have set up a maintenance
                 window, it will be restarted during the maintenance
                 window after 7 days.

               * **MaxCapacity** *(float) --*

                 For Glue version 1.0 or earlier jobs, using the
                 standard worker type, the number of Glue data
                 processing units (DPUs) that can be allocated when
                 this job runs. A DPU is a relative measure of
                 processing power that consists of 4 vCPUs of compute
                 capacity and 16 GB of memory. For more information,
                 see the Glue pricing page.

                 For Glue version 2.0+ jobs, you cannot specify a
                 "Maximum capacity". Instead, you should specify a
                 "Worker type" and the "Number of workers".

                 Do not set "MaxCapacity" if using "WorkerType" and
                 "NumberOfWorkers".

                 The value that can be allocated for "MaxCapacity"
                 depends on whether you are running a Python shell
                 job, an Apache Spark ETL job, or an Apache Spark
                 streaming ETL job:

                 * When you specify a Python shell job
                   ("JobCommand.Name"="pythonshell"), you can
                   allocate either 0.0625 or 1 DPU. The default is
                   0.0625 DPU.

                 * When you specify an Apache Spark ETL job
                   ("JobCommand.Name"="glueetl") or Apache Spark
                   streaming ETL job
                   ("JobCommand.Name"="gluestreaming"), you can
                   allocate from 2 to 100 DPUs. The default is 10
                   DPUs. This job type cannot have a fractional DPU
                   allocation.

               * **WorkerType** *(string) --*

                 The type of predefined worker that is allocated when
                 a job runs. Accepts a value of G.1X, G.2X, G.4X, G.8X
                 or G.025X for Spark jobs. Accepts the value Z.2X for
                 Ray jobs.

                 * For the "G.1X" worker type, each worker maps to 1
                   DPU (4 vCPUs, 16 GB of memory) with 94GB disk, and
                   provides 1 executor per worker. We recommend this
                   worker type for workloads such as data transforms,
                   joins, and queries, as it offers a scalable and
                   cost-effective way to run most jobs.

                 * For the "G.2X" worker type, each worker maps to 2
                   DPU (8 vCPUs, 32 GB of memory) with 138GB disk, and
                   provides 1 executor per worker. We recommend this
                   worker type for workloads such as data transforms,
                   joins, and queries, as it offers a scalable and
                   cost-effective way to run most jobs.

                 * For the "G.4X" worker type, each worker maps to 4
                   DPU (16 vCPUs, 64 GB of memory) with 256GB disk,
                   and provides 1 executor per worker. We recommend
                   this worker type for jobs whose workloads contain
                   your most demanding transforms, aggregations,
                   joins, and queries. This worker type is available
                   only for Glue version 3.0 or later Spark ETL jobs
                   in the following Amazon Web Services Regions: US
                   East (Ohio), US East (N. Virginia), US West
                   (Oregon), Asia Pacific (Singapore), Asia Pacific
                   (Sydney), Asia Pacific (Tokyo), Canada (Central),
                   Europe (Frankfurt), Europe (Ireland), and Europe
                   (Stockholm).

                 * For the "G.8X" worker type, each worker maps to 8
                   DPU (32 vCPUs, 128 GB of memory) with 512GB disk,
                   and provides 1 executor per worker. We recommend
                   this worker type for jobs whose workloads contain
                   your most demanding transforms, aggregations,
                   joins, and queries. This worker type is available
                   only for Glue version 3.0 or later Spark ETL jobs,
                   in the same Amazon Web Services Regions as
                   supported for the "G.4X" worker type.

                 * For the "G.025X" worker type, each worker maps to
                   0.25 DPU (2 vCPUs, 4 GB of memory) with 84GB disk,
                   and provides 1 executor per worker. We recommend
                   this worker type for low volume streaming jobs.
                   This worker type is only available for Glue version
                   3.0 or later streaming jobs.

                 * For the "Z.2X" worker type, each worker maps to 2
                   M-DPU (8 vCPUs, 64 GB of memory) with 128 GB disk,
                   and provides up to 8 Ray workers based on the
                   autoscaler.

               * **NumberOfWorkers** *(integer) --*

                 The number of workers of a defined "WorkerType" that
                 are allocated when a job runs.

               * **SecurityConfiguration** *(string) --*

                 The name of the "SecurityConfiguration" structure to
                 be used with this job run.

               * **LogGroupName** *(string) --*

                 The name of the log group for secure logging that can
                 be server-side encrypted in Amazon CloudWatch using
                 KMS. This name can be "/aws-glue/jobs/", in which
                 case the default encryption is "NONE". If you add a
                 role name and "SecurityConfiguration" name (in other
                 words, "/aws-glue/jobs-yourRoleName-
                 yourSecurityConfigurationName/"), then that security
                 configuration is used to encrypt the log group.

               * **NotificationProperty** *(dict) --*

                 Specifies configuration properties of a job run
                 notification.

                 * **NotifyDelayAfter** *(integer) --*

                   After a job run starts, the number of minutes to
                   wait before sending a job run delay notification.

               * **GlueVersion** *(string) --*

                 In Spark jobs, "GlueVersion" determines the versions
                 of Apache Spark and Python that Glue makes available
                 in a job. The Python version indicates the version
                 supported for jobs of type Spark.

                 Ray jobs should set "GlueVersion" to "4.0" or
                 greater. However, the versions of Ray, Python and
                 additional libraries available in your Ray job are
                 determined by the "Runtime" parameter of the Job
                 command.

                 For more information about the available Glue
                 versions and corresponding Spark and Python versions,
                 see Glue version in the developer guide.

                 Jobs that are created without specifying a Glue
                 version default to Glue 0.9.

               * **DPUSeconds** *(float) --*

                 This field can be set for either job runs with
                 execution class "FLEX" or when Auto Scaling is
                 enabled, and represents the total time each executor
                 ran during the lifecycle of a job run in seconds,
                 multiplied by a DPU factor (1 for "G.1X", 2 for
                 "G.2X", or 0.25 for "G.025X" workers). This value may
                 be different than the "executionEngineRuntime" *
                 "MaxCapacity" as in the case of Auto Scaling jobs, as
                 the number of executors running at a given time may
                 be less than the "MaxCapacity". Therefore, it is
                 possible that the value of "DPUSeconds" is less than
                 "executionEngineRuntime" * "MaxCapacity".

               * **ExecutionClass** *(string) --*

                 Indicates whether the job is run with a standard or
                 flexible execution class. The standard execution
                 class is ideal for time-sensitive workloads that
                 require fast job startup and dedicated resources.

                 The flexible execution class is appropriate for time-
                 insensitive jobs whose start and completion times may
                 vary.

                 Only jobs with Glue version 3.0 and above and command
                 type "glueetl" will be allowed to set
                 "ExecutionClass" to "FLEX". The flexible execution
                 class is available for Spark jobs.

               * **MaintenanceWindow** *(string) --*

                 This field specifies a day of the week and hour for a
                 maintenance window for streaming jobs. Glue
                 periodically performs maintenance activities. During
                 these maintenance windows, Glue will need to restart
                 your streaming jobs.

                 Glue will restart the job within 3 hours of the
                 specified maintenance window. For instance, if you
                 set up the maintenance window for Monday at 10:00AM
                 GMT, your jobs will be restarted between 10:00AM GMT
                 and 1:00PM GMT.

               * **ProfileName** *(string) --*

                 The name of a Glue usage profile associated with the
                 job run.

               * **StateDetail** *(string) --*

                 This field holds details that pertain to the state of
                 a job run. The field is nullable.

                 For example, when a job run is in a WAITING state as
                 a result of job run queuing, the field has the reason
                 why the job run is in that state.

               * **ExecutionRoleSessionPolicy** *(string) --*

                 An inline session policy passed to the StartJobRun
                 API. It lets you dynamically restrict the permissions
                 of the specified execution role for the scope of the
                 job, without requiring the creation of additional IAM
                 roles.
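
The response structure above can be consumed page by page. The following is a minimal sketch, assuming `pages` is any iterable of response dicts shaped like the Response Syntax. The helper names and sample values are hypothetical. The DPU-per-worker table copies the mappings stated in the "WorkerType" description; "Z.2X" is left out because it is measured in M-DPU, and "Standard" is left out because no mapping is given for it.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Iterator, Optional

# DPUs per worker, as listed in the WorkerType description above.
DPU_PER_WORKER = {"G.025X": 0.25, "G.1X": 1.0, "G.2X": 2.0, "G.4X": 4.0, "G.8X": 8.0}

# States treated as unsuccessful in this sketch (an illustrative choice).
UNSUCCESSFUL_STATES = {"FAILED", "TIMEOUT", "ERROR"}


def iter_job_runs(pages: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Flatten the 'JobRuns' list of every page into a single stream."""
    for page in pages:
        yield from page.get("JobRuns", [])


def allocated_dpus(run: Dict[str, Any]) -> Optional[float]:
    """Estimate DPUs from WorkerType and NumberOfWorkers, else use MaxCapacity."""
    worker_type = run.get("WorkerType")
    workers = run.get("NumberOfWorkers")
    if worker_type in DPU_PER_WORKER and workers is not None:
        return DPU_PER_WORKER[worker_type] * workers
    return run.get("MaxCapacity")


def summarize_job_runs(pages: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Count runs per JobRunState and collect unsuccessful runs with messages."""
    states: Counter = Counter()
    unsuccessful = []
    for run in iter_job_runs(pages):
        state = run.get("JobRunState", "UNKNOWN")
        states[state] += 1
        if state in UNSUCCESSFUL_STATES:
            unsuccessful.append((run.get("Id"), run.get("ErrorMessage", "")))
    return {"states": dict(states), "unsuccessful": unsuccessful}


# Hypothetical pages; only the key names come from the structure above.
sample_pages = [
    {"JobRuns": [
        {"Id": "jr_1", "JobRunState": "SUCCEEDED", "WorkerType": "G.1X", "NumberOfWorkers": 10},
        {"Id": "jr_2", "JobRunState": "FAILED", "ErrorMessage": "boom",
         "WorkerType": "G.2X", "NumberOfWorkers": 5},
    ]},
    {"JobRuns": [{"Id": "jr_3", "JobRunState": "TIMEOUT", "MaxCapacity": 0.0625}]},
]

report = summarize_job_runs(sample_pages)
capacities = [allocated_dpus(run) for run in iter_job_runs(sample_pages)]
print(report, capacities)
```

With a real client, `sample_pages` could be replaced by the iterator returned from `paginator.paginate(JobName=...)`, since that iterator yields one response dict per page.
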


ListConnectionTypes
*******************

class Glue.Paginator.ListConnectionTypes

      paginator = client.get_paginator('list_connection_types')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.list_connection_types()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             PaginationConfig={
                 'MaxItems': 123,
                 'PageSize': 123,
                 'StartingToken': 'string'
             }
         )

      Parameters:
         **PaginationConfig** (*dict*) --

         A dictionary that provides parameters to control pagination.

         * **MaxItems** *(integer) --*

           The total number of items to return. If the total number of
           items available is more than the value specified in max-
           items then a "NextToken" will be provided in the output
           that you can use to resume pagination.

         * **PageSize** *(integer) --*

           The size of each page.

         * **StartingToken** *(string) --*

           A token to specify where to start paginating. This is the
           "NextToken" from a previous response.

      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'ConnectionTypes': [
                    {
                        'ConnectionType': 'JDBC'|'SFTP'|'MONGODB'|'KAFKA'|'NETWORK'|'MARKETPLACE'|'CUSTOM'|'SALESFORCE'|'VIEW_VALIDATION_REDSHIFT'|'VIEW_VALIDATION_ATHENA'|'GOOGLEADS'|'GOOGLESHEETS'|'GOOGLEANALYTICS4'|'SERVICENOW'|'MARKETO'|'SAPODATA'|'ZENDESK'|'JIRACLOUD'|'NETSUITEERP'|'HUBSPOT'|'FACEBOOKADS'|'INSTAGRAMADS'|'ZOHOCRM'|'SALESFORCEPARDOT'|'SALESFORCEMARKETINGCLOUD'|'SLACK'|'STRIPE'|'INTERCOM'|'SNAPCHATADS',
                        'DisplayName': 'string',
                        'Vendor': 'string',
                        'Description': 'string',
                        'Categories': [
                            'string',
                        ],
                        'Capabilities': {
                            'SupportedAuthenticationTypes': [
                                'BASIC'|'OAUTH2'|'CUSTOM'|'IAM',
                            ],
                            'SupportedDataOperations': [
                                'READ'|'WRITE',
                            ],
                            'SupportedComputeEnvironments': [
                                'SPARK'|'ATHENA'|'PYTHON',
                            ]
                        },
                        'LogoUrl': 'string',
                        'ConnectionTypeVariants': [
                            {
                                'ConnectionTypeVariantName': 'string',
                                'DisplayName': 'string',
                                'Description': 'string',
                                'LogoUrl': 'string'
                            },
                        ]
                    },
                ],

            }

         **Response Structure**

         * *(dict) --*

           * **ConnectionTypes** *(list) --*

             A list of "ConnectionTypeBrief" objects containing brief
             information about the supported connection types.

             * *(dict) --*

               Brief information about a supported connection type
               returned by the "ListConnectionTypes" API.

               * **ConnectionType** *(string) --*

                 The name of the connection type.

               * **DisplayName** *(string) --*

                 The human-readable name for the connection type that
                 is displayed in the Glue console.

               * **Vendor** *(string) --*

                 The name of the vendor or provider that created or
                 maintains this connection type.

               * **Description** *(string) --*

                 A description of the connection type.

               * **Categories** *(list) --*

                 A list of categories that this connection type
                 belongs to. Categories help users filter and find
                 appropriate connection types based on their use
                 cases.

                 * *(string) --*

               * **Capabilities** *(dict) --*

                 The supported authentication types, data interface
                 types (compute environments), and data operations of
                 the connector.

                 * **SupportedAuthenticationTypes** *(list) --*

                   A list of supported authentication types.

                   * *(string) --*

                 * **SupportedDataOperations** *(list) --*

                   A list of supported data operations.

                   * *(string) --*

                 * **SupportedComputeEnvironments** *(list) --*

                   A list of supported compute environments.

                   * *(string) --*

               * **LogoUrl** *(string) --*

                 The URL of the logo associated with a connection
                 type.

               * **ConnectionTypeVariants** *(list) --*

                 A list of variants available for this connection
                 type. Different variants may provide specialized
                 configurations for specific use cases or
                 implementations of the same general connection type.

                 * *(dict) --*

                   Represents a variant of a connection type in Glue
                   Data Catalog. Connection type variants provide
                   specific configurations and behaviors for different
                   implementations of the same general connection
                   type.

                   * **ConnectionTypeVariantName** *(string) --*

                     The unique identifier for the connection type
                     variant. This name is used internally to identify
                     the specific variant of a connection type.

                   * **DisplayName** *(string) --*

                     The human-readable name for the connection type
                     variant that is displayed in the Glue console.

                   * **Description** *(string) --*

                     A detailed description of the connection type
                     variant, including its purpose, use cases, and
                     any specific configuration requirements.

                   * **LogoUrl** *(string) --*

                     The URL of the logo associated with a connection
                     type variant.
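
The "Capabilities" lists above lend themselves to simple filtering. The following is a minimal sketch, assuming `pages` is an iterable of response dicts shaped like the Response Syntax. The function name `connection_types_supporting` is hypothetical, and the capability values in the sample data are invented for illustration; they do not describe what any real connection type supports.

```python
from typing import Any, Dict, Iterable, List, Optional


def connection_types_supporting(
    pages: Iterable[Dict[str, Any]],
    auth_type: Optional[str] = None,
    operation: Optional[str] = None,
    compute_environment: Optional[str] = None,
) -> List[str]:
    """Return ConnectionType names whose capabilities include every given filter."""
    matches = []
    for page in pages:
        for entry in page.get("ConnectionTypes", []):
            caps = entry.get("Capabilities", {})
            checks = (
                (auth_type, caps.get("SupportedAuthenticationTypes", [])),
                (operation, caps.get("SupportedDataOperations", [])),
                (compute_environment, caps.get("SupportedComputeEnvironments", [])),
            )
            # A filter left as None is ignored; the others must all be present.
            if all(wanted is None or wanted in supported for wanted, supported in checks):
                matches.append(entry.get("ConnectionType"))
    return matches


# Invented capability values, used only to exercise the filter.
sample_pages = [
    {"ConnectionTypes": [
        {"ConnectionType": "SALESFORCE", "Capabilities": {
            "SupportedAuthenticationTypes": ["OAUTH2"],
            "SupportedDataOperations": ["READ", "WRITE"],
            "SupportedComputeEnvironments": ["SPARK"]}},
        {"ConnectionType": "JDBC", "Capabilities": {
            "SupportedAuthenticationTypes": ["BASIC"],
            "SupportedDataOperations": ["READ", "WRITE"],
            "SupportedComputeEnvironments": ["SPARK", "ATHENA"]}},
    ]},
    {"ConnectionTypes": [
        {"ConnectionType": "SLACK", "Capabilities": {
            "SupportedAuthenticationTypes": ["OAUTH2"],
            "SupportedDataOperations": ["READ"],
            "SupportedComputeEnvironments": ["SPARK"]}},
    ]},
]

oauth_types = connection_types_supporting(sample_pages, auth_type="OAUTH2")
oauth_writers = connection_types_supporting(
    sample_pages, auth_type="OAUTH2", operation="WRITE"
)
print(oauth_types, oauth_writers)
```
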


ListWorkflows
*************

class Glue.Paginator.ListWorkflows

      paginator = client.get_paginator('list_workflows')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.list_workflows()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             PaginationConfig={
                 'MaxItems': 123,
                 'PageSize': 123,
                 'StartingToken': 'string'
             }
         )

      Parameters:
         **PaginationConfig** (*dict*) --

         A dictionary that provides parameters to control pagination.

         * **MaxItems** *(integer) --*

           The total number of items to return. If the total number of
           items available is more than the value specified in max-
           items then a "NextToken" will be provided in the output
           that you can use to resume pagination.

         * **PageSize** *(integer) --*

           The size of each page.

         * **StartingToken** *(string) --*

           A token to specify where to start paginating. This is the
           "NextToken" from a previous response.

      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'Workflows': [
                    'string',
                ],

            }

         **Response Structure**

         * *(dict) --*

           * **Workflows** *(list) --*

             List of names of workflows in the account.

             * *(string) --*
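
As a minimal sketch of how this paginator might be used, the helper below walks every page and collects the workflow names into one list. The name "collect_workflow_names" and its "page_size" default are illustrative and not part of boto3; "client" is assumed to be a Glue client such as "boto3.client('glue')".

```python
def collect_workflow_names(client, page_size=25):
    """Return all workflow names, fetching them page by page.

    `client` is assumed to be a Glue client, for example
    boto3.client('glue'). The helper name and the default page
    size are illustrative choices, not part of boto3.
    """
    paginator = client.get_paginator('list_workflows')
    names = []
    for page in paginator.paginate(PaginationConfig={'PageSize': page_size}):
        # Each page is a dict; 'Workflows' holds a list of name strings.
        names.extend(page.get('Workflows', []))
    return names
```

Taking the client as an argument keeps the helper free of credential handling, so it can be exercised with a stub object as well as a real client.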


ListUsageProfiles
*****************

class Glue.Paginator.ListUsageProfiles

      paginator = client.get_paginator('list_usage_profiles')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.list_usage_profiles()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             PaginationConfig={
                 'MaxItems': 123,
                 'PageSize': 123,
                 'StartingToken': 'string'
             }
         )

      Parameters:
         **PaginationConfig** (*dict*) --

         A dictionary that provides parameters to control pagination.

         * **MaxItems** *(integer) --*

           The total number of items to return. If the total number of
           items available is more than the value specified in max-
           items then a "NextToken" will be provided in the output
           that you can use to resume pagination.

         * **PageSize** *(integer) --*

           The size of each page.

         * **StartingToken** *(string) --*

           A token to specify where to start paginating. This is the
           "NextToken" from a previous response.

      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'Profiles': [
                    {
                        'Name': 'string',
                        'Description': 'string',
                        'CreatedOn': datetime(2015, 1, 1),
                        'LastModifiedOn': datetime(2015, 1, 1)
                    },
                ],

            }

         **Response Structure**

         * *(dict) --*

           * **Profiles** *(list) --*

             A list of usage profile ("UsageProfileDefinition")
             objects.

             * *(dict) --*

               Describes a Glue usage profile.

               * **Name** *(string) --*

                 The name of the usage profile.

               * **Description** *(string) --*

                 A description of the usage profile.

               * **CreatedOn** *(datetime) --*

                 The date and time when the usage profile was created.

               * **LastModifiedOn** *(datetime) --*

                 The date and time when the usage profile was last
                 modified.
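
The "MaxItems" and "StartingToken" settings described above can be combined to fetch results in resumable batches. The sketch below is one way to do that, assuming "client" is a Glue client such as "boto3.client('glue')"; the helper name "fetch_usage_profiles" is hypothetical. It relies on "build_full_result()", which merges the pages fetched under the given configuration and includes a "NextToken" key when more items remain.

```python
def fetch_usage_profiles(client, max_items, starting_token=None):
    """Fetch up to `max_items` usage profiles and a resume token.

    Returns (profiles, next_token). `next_token` is None when no
    further items remain. The helper name is illustrative; `client`
    is assumed to be a Glue client such as boto3.client('glue').
    """
    config = {'MaxItems': max_items}
    if starting_token is not None:
        # Resume where a previous call stopped.
        config['StartingToken'] = starting_token

    paginator = client.get_paginator('list_usage_profiles')
    result = paginator.paginate(PaginationConfig=config).build_full_result()
    return result.get('Profiles', []), result.get('NextToken')
```

A caller would typically loop, passing the returned token back in as "starting_token" until it comes back as "None".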


GetUserDefinedFunctions
***********************

class Glue.Paginator.GetUserDefinedFunctions

      paginator = client.get_paginator('get_user_defined_functions')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.get_user_defined_functions()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             CatalogId='string',
             DatabaseName='string',
             Pattern='string',
             PaginationConfig={
                 'MaxItems': 123,
                 'PageSize': 123,
                 'StartingToken': 'string'
             }
         )

      Parameters:
         * **CatalogId** (*string*) -- The ID of the Data Catalog
           where the functions to be retrieved are located. If none is
           provided, the Amazon Web Services account ID is used by
           default.

         * **DatabaseName** (*string*) -- The name of the catalog
           database where the functions are located. If none is
           provided, functions from all the databases across the
           catalog will be returned.

         * **Pattern** (*string*) --

           **[REQUIRED]**

           A function-name pattern string that filters the function
           definitions returned.

         * **PaginationConfig** (*dict*) --

           A dictionary that provides parameters to control
           pagination.

           * **MaxItems** *(integer) --*

             The total number of items to return. If the total number
             of items available is more than the value specified in
             max-items then a "NextToken" will be provided in the
             output that you can use to resume pagination.

           * **PageSize** *(integer) --*

             The size of each page.

           * **StartingToken** *(string) --*

             A token to specify where to start paginating. This is the
             "NextToken" from a previous response.

      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'UserDefinedFunctions': [
                    {
                        'FunctionName': 'string',
                        'DatabaseName': 'string',
                        'ClassName': 'string',
                        'OwnerName': 'string',
                        'OwnerType': 'USER'|'ROLE'|'GROUP',
                        'CreateTime': datetime(2015, 1, 1),
                        'ResourceUris': [
                            {
                                'ResourceType': 'JAR'|'FILE'|'ARCHIVE',
                                'Uri': 'string'
                            },
                        ],
                        'CatalogId': 'string'
                    },
                ],

            }

         **Response Structure**

         * *(dict) --*

           * **UserDefinedFunctions** *(list) --*

             A list of requested function definitions.

             * *(dict) --*

               Represents the equivalent of a Hive user-defined
               function ("UDF") definition.

               * **FunctionName** *(string) --*

                 The name of the function.

               * **DatabaseName** *(string) --*

                 The name of the catalog database that contains the
                 function.

               * **ClassName** *(string) --*

                 The Java class that contains the function code.

               * **OwnerName** *(string) --*

                 The owner of the function.

               * **OwnerType** *(string) --*

                 The owner type.

               * **CreateTime** *(datetime) --*

                 The time at which the function was created.

               * **ResourceUris** *(list) --*

                 The resource URIs for the function.

                 * *(dict) --*

                   The URIs for function resources.

                   * **ResourceType** *(string) --*

                     The type of the resource.

                   * **Uri** *(string) --*

                     The URI for accessing the resource.

               * **CatalogId** *(string) --*

                 The ID of the Data Catalog in which the function
                 resides.
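
To illustrate how the nested "ResourceUris" list in the response might be consumed, the sketch below gathers the JAR URIs of each returned function. The helper name "jar_uris_by_function" is hypothetical, "client" is assumed to be a Glue client such as "boto3.client('glue')", and the pattern value is passed through to the service unchanged.

```python
def jar_uris_by_function(client, pattern, database_name=None):
    """Map (DatabaseName, FunctionName) to that function's JAR URIs.

    Illustrative helper, not part of boto3. `client` is assumed to
    be a Glue client such as boto3.client('glue').
    """
    kwargs = {'Pattern': pattern}
    if database_name is not None:
        kwargs['DatabaseName'] = database_name

    paginator = client.get_paginator('get_user_defined_functions')
    uris = {}
    for page in paginator.paginate(**kwargs):
        for udf in page.get('UserDefinedFunctions', []):
            # Keep only resources whose type is 'JAR'.
            jars = [
                resource['Uri']
                for resource in udf.get('ResourceUris', [])
                if resource.get('ResourceType') == 'JAR'
            ]
            uris[(udf.get('DatabaseName'), udf['FunctionName'])] = jars
    return uris
```

Keying on the database name as well as the function name avoids collisions when "DatabaseName" is omitted and functions from several databases are returned.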


GetDatabases
************

class Glue.Paginator.GetDatabases

      paginator = client.get_paginator('get_databases')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.get_databases()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             CatalogId='string',
             ResourceShareType='FOREIGN'|'ALL'|'FEDERATED',
             AttributesToGet=[
                 'NAME',
             ],
             PaginationConfig={
                 'MaxItems': 123,
                 'PageSize': 123,
                 'StartingToken': 'string'
             }
         )

      Parameters:
         * **CatalogId** (*string*) -- The ID of the Data Catalog from
           which to retrieve "Databases". If none is provided, the
           Amazon Web Services account ID is used by default.

         * **ResourceShareType** (*string*) --

           Allows you to specify that you want to list the databases
           shared with your account. The allowable values are
           "FEDERATED", "FOREIGN" or "ALL".

           * If set to "FEDERATED", will list the federated databases
             (referencing an external entity) shared with your
             account.

           * If set to "FOREIGN", will list the databases shared with
             your account.

           * If set to "ALL", will list the databases shared with your
             account, as well as the databases in your local account.

         * **AttributesToGet** (*list*) --

           Specifies the database fields returned by the
           "GetDatabases" call. This parameter doesn’t accept an empty
           list. The request must include "NAME".

           * *(string) --*

         * **PaginationConfig** (*dict*) --

           A dictionary that provides parameters to control
           pagination.

           * **MaxItems** *(integer) --*

             The total number of items to return. If the total number
             of items available is more than the value specified in
             max-items then a "NextToken" will be provided in the
             output that you can use to resume pagination.

           * **PageSize** *(integer) --*

             The size of each page.

           * **StartingToken** *(string) --*

             A token to specify where to start paginating. This is the
             "NextToken" from a previous response.

      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'DatabaseList': [
                    {
                        'Name': 'string',
                        'Description': 'string',
                        'LocationUri': 'string',
                        'Parameters': {
                            'string': 'string'
                        },
                        'CreateTime': datetime(2015, 1, 1),
                        'CreateTableDefaultPermissions': [
                            {
                                'Principal': {
                                    'DataLakePrincipalIdentifier': 'string'
                                },
                                'Permissions': [
                                    'ALL'|'SELECT'|'ALTER'|'DROP'|'DELETE'|'INSERT'|'CREATE_DATABASE'|'CREATE_TABLE'|'DATA_LOCATION_ACCESS',
                                ]
                            },
                        ],
                        'TargetDatabase': {
                            'CatalogId': 'string',
                            'DatabaseName': 'string',
                            'Region': 'string'
                        },
                        'CatalogId': 'string',
                        'FederatedDatabase': {
                            'Identifier': 'string',
                            'ConnectionName': 'string',
                            'ConnectionType': 'string'
                        }
                    },
                ],

            }

         **Response Structure**

         * *(dict) --*

           * **DatabaseList** *(list) --*

             A list of "Database" objects from the specified catalog.

             * *(dict) --*

               The "Database" object represents a logical grouping of
               tables that might reside in a Hive metastore or an
               RDBMS.

               * **Name** *(string) --*

                 The name of the database. For Hive compatibility,
                 this is folded to lowercase when it is stored.

               * **Description** *(string) --*

                 A description of the database.

               * **LocationUri** *(string) --*

                 The location of the database (for example, an HDFS
                 path).

               * **Parameters** *(dict) --*

                 These key-value pairs define parameters and
                 properties of the database.

                 * *(string) --*

                   * *(string) --*

               * **CreateTime** *(datetime) --*

                 The time at which the metadata database was created
                 in the catalog.

               * **CreateTableDefaultPermissions** *(list) --*

                 Creates a set of default permissions on the table for
                 principals. Used by Lake Formation. Not used in the
                 normal course of Glue operations.

                 * *(dict) --*

                   Permissions granted to a principal.

                   * **Principal** *(dict) --*

                     The principal who is granted permissions.

                     * **DataLakePrincipalIdentifier** *(string) --*

                       An identifier for the Lake Formation principal.

                   * **Permissions** *(list) --*

                     The permissions that are granted to the
                     principal.

                     * *(string) --*

               * **TargetDatabase** *(dict) --*

                 A "DatabaseIdentifier" structure that describes a
                 target database for resource linking.

                 * **CatalogId** *(string) --*

                   The ID of the Data Catalog in which the database
                   resides.

                 * **DatabaseName** *(string) --*

                   The name of the catalog database.

                 * **Region** *(string) --*

                   Region of the target database.

               * **CatalogId** *(string) --*

                 The ID of the Data Catalog in which the database
                 resides.

               * **FederatedDatabase** *(dict) --*

                 A "FederatedDatabase" structure that references an
                 entity outside the Glue Data Catalog.

                 * **Identifier** *(string) --*

                   A unique identifier for the federated database.

                 * **ConnectionName** *(string) --*

                   The name of the connection to the external
                   metastore.

                 * **ConnectionType** *(string) --*

                   The type of connection used to access the federated
                   database, such as JDBC, ODBC, or other supported
                   connection protocols.
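
The response structure above distinguishes federated databases ("FederatedDatabase") and resource links ("TargetDatabase") from other entries. As a rough sketch, the helper below requests "ResourceShareType='ALL'" and groups database names by which of those keys is populated. The helper name "group_databases" and the three group labels are hypothetical, and treating the presence of a key as the deciding signal is an assumption rather than documented behaviour; "client" is assumed to be a Glue client such as "boto3.client('glue')".

```python
def group_databases(client, catalog_id=None):
    """Group database names by the kind of entry they appear to be.

    Heuristic (an assumption, not documented behaviour):
      - 'FederatedDatabase' populated -> 'federated'
      - 'TargetDatabase' populated    -> 'resource_link'
      - anything else                 -> 'other'
    """
    kwargs = {'ResourceShareType': 'ALL'}
    if catalog_id is not None:
        kwargs['CatalogId'] = catalog_id

    groups = {'federated': [], 'resource_link': [], 'other': []}
    paginator = client.get_paginator('get_databases')
    for page in paginator.paginate(**kwargs):
        for database in page.get('DatabaseList', []):
            if database.get('FederatedDatabase'):
                kind = 'federated'
            elif database.get('TargetDatabase'):
                kind = 'resource_link'
            else:
                kind = 'other'
            groups[kind].append(database['Name'])
    return groups
```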


DescribeEntity
**************

class Glue.Paginator.DescribeEntity

      paginator = client.get_paginator('describe_entity')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.describe_entity()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             ConnectionName='string',
             CatalogId='string',
             EntityName='string',
             DataStoreApiVersion='string',
             PaginationConfig={
                 'MaxItems': 123,
                 'StartingToken': 'string'
             }
         )

      Parameters:
         * **ConnectionName** (*string*) --

           **[REQUIRED]**

           The name of the connection that contains the connection
           type credentials.

         * **CatalogId** (*string*) -- The catalog ID of the catalog
           that contains the connection. This can be null. By default,
           the Amazon Web Services account ID is the catalog ID.

         * **EntityName** (*string*) --

           **[REQUIRED]**

           The name of the entity that you want to describe from the
           connection type.

         * **DataStoreApiVersion** (*string*) -- The version of the
           API used for the data store.

         * **PaginationConfig** (*dict*) --

           A dictionary that provides parameters to control
           pagination.

           * **MaxItems** *(integer) --*

             The total number of items to return. If the total number
             of items available is more than the value specified in
             max-items then a "NextToken" will be provided in the
             output that you can use to resume pagination.

           * **StartingToken** *(string) --*

             A token to specify where to start paginating. This is the
             "NextToken" from a previous response.

      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'Fields': [
                    {
                        'FieldName': 'string',
                        'Label': 'string',
                        'Description': 'string',
                        'FieldType': 'INT'|'SMALLINT'|'BIGINT'|'FLOAT'|'LONG'|'DATE'|'BOOLEAN'|'MAP'|'ARRAY'|'STRING'|'TIMESTAMP'|'DECIMAL'|'BYTE'|'SHORT'|'DOUBLE'|'STRUCT',
                        'IsPrimaryKey': True|False,
                        'IsNullable': True|False,
                        'IsRetrievable': True|False,
                        'IsFilterable': True|False,
                        'IsPartitionable': True|False,
                        'IsCreateable': True|False,
                        'IsUpdateable': True|False,
                        'IsUpsertable': True|False,
                        'IsDefaultOnCreate': True|False,
                        'SupportedValues': [
                            'string',
                        ],
                        'SupportedFilterOperators': [
                            'LESS_THAN'|'GREATER_THAN'|'BETWEEN'|'EQUAL_TO'|'NOT_EQUAL_TO'|'GREATER_THAN_OR_EQUAL_TO'|'LESS_THAN_OR_EQUAL_TO'|'CONTAINS'|'ORDER_BY',
                        ],
                        'ParentField': 'string',
                        'NativeDataType': 'string',
                        'CustomProperties': {
                            'string': 'string'
                        }
                    },
                ],

            }

         **Response Structure**

         * *(dict) --*

           * **Fields** *(list) --*

             Describes the fields for that connector entity. This is
             a list of "Field" objects. A "Field" is similar to a
             column in a database. Each "Field" object has
             information about the properties associated with a field
             in the connector.

             * *(dict) --*

               The "Field" object has information about the different
               properties associated with a field in the connector.

               * **FieldName** *(string) --*

                 A unique identifier for the field.

               * **Label** *(string) --*

                 A readable label used for the field.

               * **Description** *(string) --*

                 A description of the field.

               * **FieldType** *(string) --*

                 The type of data in the field.

               * **IsPrimaryKey** *(boolean) --*

                 Indicates whether this field can be used as a primary
                 key for the given entity.

               * **IsNullable** *(boolean) --*

                 Indicates whether this field is nullable.

               * **IsRetrievable** *(boolean) --*

                 Indicates whether this field is retrievable, that is,
                 whether it can be added to the "SELECT" clause of a
                 SQL query.

               * **IsFilterable** *(boolean) --*

                 Indicates whether this field can be used in a filter
                 clause ("WHERE" clause) of a SQL statement when
                 querying data.

               * **IsPartitionable** *(boolean) --*

                 Indicates whether a given field can be used in
                 partitioning the query made to SaaS.

               * **IsCreateable** *(boolean) --*

                 Indicates whether this field can be created as part
                 of a destination write.

               * **IsUpdateable** *(boolean) --*

                 Indicates whether this field can be updated as part
                 of a destination write.

               * **IsUpsertable** *(boolean) --*

                 Indicates whether this field can be upserted as part
                 of a destination write.

               * **IsDefaultOnCreate** *(boolean) --*

                 Indicates whether this field is populated
                 automatically when the object is created, such as a
                 created-at timestamp.

               * **SupportedValues** *(list) --*

                 A list of supported values for the field.

                 * *(string) --*

               * **SupportedFilterOperators** *(list) --*

                 Indicates the supported filter operators for this
                 field.

                 * *(string) --*

               * **ParentField** *(string) --*

                 A parent field name for a nested field.

               * **NativeDataType** *(string) --*

                 The data type returned by the SaaS API, such as
                 “picklist” or “textarea” from Salesforce.

               * **CustomProperties** *(dict) --*

                 Optional map of keys which may be returned.

                 * *(string) --*

                   * *(string) --*
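
As a small example of reading the "Field" properties described above, the sketch below returns the filter operators supported by each filterable field of an entity. The helper name "filterable_fields" is hypothetical and "client" is assumed to be a Glue client such as "boto3.client('glue')". Note that the "PaginationConfig" shown for this paginator lists no "PageSize" key, so the sketch does not set one.

```python
def filterable_fields(client, connection_name, entity_name):
    """Map each filterable field name to its supported filter operators.

    Illustrative helper, not part of boto3. `client` is assumed to
    be a Glue client such as boto3.client('glue').
    """
    paginator = client.get_paginator('describe_entity')
    pages = paginator.paginate(
        ConnectionName=connection_name,
        EntityName=entity_name,
    )

    operators = {}
    for page in pages:
        for field in page.get('Fields', []):
            # Skip fields that cannot appear in a filter clause.
            if field.get('IsFilterable'):
                operators[field['FieldName']] = list(
                    field.get('SupportedFilterOperators', [])
                )
    return operators
```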


ListEntities
************

class Glue.Paginator.ListEntities

      paginator = client.get_paginator('list_entities')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.list_entities()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             ConnectionName='string',
             CatalogId='string',
             ParentEntityName='string',
             DataStoreApiVersion='string',
             PaginationConfig={
                 'MaxItems': 123,
                 'StartingToken': 'string'
             }
         )

      Parameters:
         * **ConnectionName** (*string*) -- The name of the connection
           that has the required credentials to query any connection
           type.

         * **CatalogId** (*string*) -- The catalog ID of the catalog
           that contains the connection. This can be null. By default,
           the Amazon Web Services account ID is the catalog ID.

         * **ParentEntityName** (*string*) -- Name of the parent
           entity for which you want to list the children. This
           parameter takes a fully-qualified path of the entity in
           order to list the child entities.

         * **DataStoreApiVersion** (*string*) -- The API version of
           the SaaS connector.

         * **PaginationConfig** (*dict*) --

           A dictionary that provides parameters to control
           pagination.

           * **MaxItems** *(integer) --*

             The total number of items to return. If the total number
             of items available is more than the value specified in
             max-items then a "NextToken" will be provided in the
             output that you can use to resume pagination.

           * **StartingToken** *(string) --*

             A token to specify where to start paginating. This is the
             "NextToken" from a previous response.

      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'Entities': [
                    {
                        'EntityName': 'string',
                        'Label': 'string',
                        'IsParentEntity': True|False,
                        'Description': 'string',
                        'Category': 'string',
                        'CustomProperties': {
                            'string': 'string'
                        }
                    },
                ],

            }

         **Response Structure**

         * *(dict) --*

           * **Entities** *(list) --*

             A list of "Entity" objects.

             * *(dict) --*

               An entity supported by a given "ConnectionType".

               * **EntityName** *(string) --*

                 The name of the entity.

               * **Label** *(string) --*

                 Label used for the entity.

               * **IsParentEntity** *(boolean) --*

                 A Boolean value that indicates whether the entity has
                 sub-objects that can be listed.

               * **Description** *(string) --*

                 A description of the entity.

               * **Category** *(string) --*

                 The type of entities that are present in the
                 response. This value depends on the source
                 connection. For example, this is "SObjects" for
                 Salesforce and "databases", "schemas", or "tables"
                 for sources like Amazon Redshift.

               * **CustomProperties** *(dict) --*

                 An optional map of keys which may be returned for an
                 entity by a connector.

                 * *(string) --*

                   * *(string) --*
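
Because "Category" varies with the source connection, it can be convenient to group the returned entities by it. The sketch below does so for one connection and, optionally, one parent entity. The helper name "entities_by_category" is hypothetical, "client" is assumed to be a Glue client such as "boto3.client('glue')", and the parent path is passed through unchanged.

```python
from collections import defaultdict


def entities_by_category(client, connection_name, parent_entity_name=None):
    """Group entity names by their 'Category' value.

    Illustrative helper, not part of boto3. Entities without a
    'Category' key are grouped under None.
    """
    kwargs = {'ConnectionName': connection_name}
    if parent_entity_name is not None:
        # List the children of this parent instead of the top level.
        kwargs['ParentEntityName'] = parent_entity_name

    grouped = defaultdict(list)
    paginator = client.get_paginator('list_entities')
    for page in paginator.paginate(**kwargs):
        for entity in page.get('Entities', []):
            grouped[entity.get('Category')].append(entity['EntityName'])
    return dict(grouped)
```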


GetClassifiers
**************

class Glue.Paginator.GetClassifiers

      paginator = client.get_paginator('get_classifiers')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.get_classifiers()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             PaginationConfig={
                 'MaxItems': 123,
                 'PageSize': 123,
                 'StartingToken': 'string'
             }
         )

      Parameters:
         **PaginationConfig** (*dict*) --

         A dictionary that provides parameters to control pagination.

         * **MaxItems** *(integer) --*

           The total number of items to return. If the total number of
           items available is more than the value specified in max-
           items then a "NextToken" will be provided in the output
           that you can use to resume pagination.

         * **PageSize** *(integer) --*

           The size of each page.

         * **StartingToken** *(string) --*

           A token to specify where to start paginating. This is the
           "NextToken" from a previous response.

      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'Classifiers': [
                    {
                        'GrokClassifier': {
                            'Name': 'string',
                            'Classification': 'string',
                            'CreationTime': datetime(2015, 1, 1),
                            'LastUpdated': datetime(2015, 1, 1),
                            'Version': 123,
                            'GrokPattern': 'string',
                            'CustomPatterns': 'string'
                        },
                        'XMLClassifier': {
                            'Name': 'string',
                            'Classification': 'string',
                            'CreationTime': datetime(2015, 1, 1),
                            'LastUpdated': datetime(2015, 1, 1),
                            'Version': 123,
                            'RowTag': 'string'
                        },
                        'JsonClassifier': {
                            'Name': 'string',
                            'CreationTime': datetime(2015, 1, 1),
                            'LastUpdated': datetime(2015, 1, 1),
                            'Version': 123,
                            'JsonPath': 'string'
                        },
                        'CsvClassifier': {
                            'Name': 'string',
                            'CreationTime': datetime(2015, 1, 1),
                            'LastUpdated': datetime(2015, 1, 1),
                            'Version': 123,
                            'Delimiter': 'string',
                            'QuoteSymbol': 'string',
                            'ContainsHeader': 'UNKNOWN'|'PRESENT'|'ABSENT',
                            'Header': [
                                'string',
                            ],
                            'DisableValueTrimming': True|False,
                            'AllowSingleColumn': True|False,
                            'CustomDatatypeConfigured': True|False,
                            'CustomDatatypes': [
                                'string',
                            ],
                            'Serde': 'OpenCSVSerDe'|'LazySimpleSerDe'|'None'
                        }
                    },
                ],

            }

         **Response Structure**

         * *(dict) --*

           * **Classifiers** *(list) --*

             The requested list of classifier objects.

             * *(dict) --*

               Classifiers are triggered during a crawl task. A
               classifier checks whether a given file is in a format
               it can handle. If it is, the classifier creates a
               schema in the form of a "StructType" object that
               matches that data format.

               You can use the standard classifiers that Glue
               provides, or you can write your own classifiers to best
               categorize your data sources and specify the
               appropriate schemas to use for them. A classifier can
               be a "grok" classifier, an "XML" classifier, a "JSON"
               classifier, or a custom "CSV" classifier, as specified
               in one of the fields in the "Classifier" object.

               * **GrokClassifier** *(dict) --*

                 A classifier that uses "grok".

                 * **Name** *(string) --*

                   The name of the classifier.

                 * **Classification** *(string) --*

                   An identifier of the data format that the
                   classifier matches, such as Twitter, JSON, Omniture
                   logs, and so on.

                 * **CreationTime** *(datetime) --*

                   The time that this classifier was registered.

                 * **LastUpdated** *(datetime) --*

                   The time that this classifier was last updated.

                 * **Version** *(integer) --*

                   The version of this classifier.

                 * **GrokPattern** *(string) --*

                   The grok pattern applied to a data store by this
                   classifier. For more information, see built-in
                   patterns in Writing Custom Classifiers.

                 * **CustomPatterns** *(string) --*

                   Optional custom grok patterns defined by this
                   classifier. For more information, see custom
                   patterns in Writing Custom Classifiers.

               * **XMLClassifier** *(dict) --*

                 A classifier for XML content.

                 * **Name** *(string) --*

                   The name of the classifier.

                 * **Classification** *(string) --*

                   An identifier of the data format that the
                   classifier matches.

                 * **CreationTime** *(datetime) --*

                   The time that this classifier was registered.

                 * **LastUpdated** *(datetime) --*

                   The time that this classifier was last updated.

                 * **Version** *(integer) --*

                   The version of this classifier.

                 * **RowTag** *(string) --*

                   The XML tag designating the element that contains
                   each record in an XML document being parsed. This
                   can't identify a self-closing element (closed by
                   "/>"). An empty row element that contains only
                   attributes can be parsed as long as it ends with a
                   closing tag (for example, "<row item_a="A"
                   item_b="B"></row>" is okay, but "<row item_a="A"
                   item_b="B" />" is not).

               * **JsonClassifier** *(dict) --*

                 A classifier for JSON content.

                 * **Name** *(string) --*

                   The name of the classifier.

                 * **CreationTime** *(datetime) --*

                   The time that this classifier was registered.

                 * **LastUpdated** *(datetime) --*

                   The time that this classifier was last updated.

                 * **Version** *(integer) --*

                   The version of this classifier.

                 * **JsonPath** *(string) --*

                   A "JsonPath" string defining the JSON data for the
                   classifier to classify. Glue supports a subset of
                   JsonPath, as described in Writing JsonPath Custom
                   Classifiers.

               * **CsvClassifier** *(dict) --*

                 A classifier for comma-separated values (CSV).

                 * **Name** *(string) --*

                   The name of the classifier.

                 * **CreationTime** *(datetime) --*

                   The time that this classifier was registered.

                 * **LastUpdated** *(datetime) --*

                   The time that this classifier was last updated.

                 * **Version** *(integer) --*

                   The version of this classifier.

                 * **Delimiter** *(string) --*

                   A custom symbol to denote what separates each
                   column entry in the row.

                 * **QuoteSymbol** *(string) --*

                   A custom symbol to denote what combines content
                   into a single column value. It must be different
                   from the column delimiter.

                 * **ContainsHeader** *(string) --*

                   Indicates whether the CSV file contains a header.

                 * **Header** *(list) --*

                   A list of strings representing column names.

                   * *(string) --*

                 * **DisableValueTrimming** *(boolean) --*

                   Specifies not to trim values before identifying the
                   type of column values. The default value is "true".

                 * **AllowSingleColumn** *(boolean) --*

                   Enables the processing of files that contain only
                   one column.

                 * **CustomDatatypeConfigured** *(boolean) --*

                   Enables the custom datatype to be configured.

                 * **CustomDatatypes** *(list) --*

                   A list of custom datatypes including "BINARY",
                   "BOOLEAN", "DATE", "DECIMAL", "DOUBLE", "FLOAT",
                   "INT", "LONG", "SHORT", "STRING", "TIMESTAMP".

                   * *(string) --*

                 * **Serde** *(string) --*

                   Sets the SerDe for processing CSV in the
                   classifier, which will be applied in the Data
                   Catalog. Valid values are "OpenCSVSerDe",
                   "LazySimpleSerDe", and "None". You can specify the
                   "None" value when you want the crawler to do the
                   detection.
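
As described above, each entry in "Classifiers" identifies its type through one of four fields. The sketch below shows one way to dispatch on that field. The helper name "classifier_kind" and the sample dict are hypothetical and not part of boto3; it assumes only one of the four fields is populated per entry.

```python
from typing import Any, Dict, Tuple

# The four fields a Classifier dict may carry, per the structure above.
CLASSIFIER_KEYS = (
    "GrokClassifier",
    "XMLClassifier",
    "JsonClassifier",
    "CsvClassifier",
)


def classifier_kind(classifier: Dict[str, Any]) -> Tuple[str, str]:
    """Return (kind, name) for one entry of the Classifiers list.

    Hypothetical helper, not part of boto3.
    """
    for key in CLASSIFIER_KEYS:
        details = classifier.get(key)
        if details:
            return key, details["Name"]
    raise ValueError("no known classifier field found")


# Made-up sample entry shaped like the response structure above.
sample = {"CsvClassifier": {"Name": "my-csv", "Delimiter": ",", "Version": 1}}
kind, name = classifier_kind(sample)
```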


ListTriggers
************

class Glue.Paginator.ListTriggers

      paginator = client.get_paginator('list_triggers')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.list_triggers()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             DependentJobName='string',
             Tags={
                 'string': 'string'
             },
             PaginationConfig={
                 'MaxItems': 123,
                 'PageSize': 123,
                 'StartingToken': 'string'
             }
         )

      Parameters:
         * **DependentJobName** (*string*) -- The name of the job for
           which to retrieve triggers. The trigger that can start this
           job is returned. If there is no such trigger, all triggers
           are returned.

         * **Tags** (*dict*) --

           Specifies to return only these tagged resources.

           * *(string) --*

             * *(string) --*

         * **PaginationConfig** (*dict*) --

           A dictionary that provides parameters to control
           pagination.

           * **MaxItems** *(integer) --*

             The total number of items to return. If more items are
             available than the value specified in "MaxItems", a
             "NextToken" is provided in the output that you can use
             to resume pagination.

           * **PageSize** *(integer) --*

             The size of each page.

           * **StartingToken** *(string) --*

             A token to specify where to start paginating. This is the
             "NextToken" from a previous response.

      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'TriggerNames': [
                    'string',
                ],

            }

         **Response Structure**

         * *(dict) --*

           * **TriggerNames** *(list) --*

             The names of all triggers in the account, or the triggers
             with the specified tags.

             * *(string) --*
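
A minimal sketch of walking every page and flattening "TriggerNames" into one list. The helper name "list_trigger_names" is hypothetical; "client" is assumed to be a Glue client such as the one returned by "boto3.client('glue')".

```python
from typing import Any, Dict, List, Optional


def list_trigger_names(
    client: Any, tags: Optional[Dict[str, str]] = None
) -> List[str]:
    """Collect trigger names across all pages (hypothetical helper)."""
    paginator = client.get_paginator("list_triggers")
    # Only pass Tags when a filter was requested.
    kwargs = {"Tags": tags} if tags else {}
    names: List[str] = []
    for page in paginator.paginate(**kwargs):
        names.extend(page.get("TriggerNames", []))
    return names
```

For example, "list_trigger_names(client, tags={'team': 'etl'})" would return only the triggers carrying that tag; the tag key and value here are made up.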


ListBlueprints
**************

class Glue.Paginator.ListBlueprints

      paginator = client.get_paginator('list_blueprints')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.list_blueprints()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             Tags={
                 'string': 'string'
             },
             PaginationConfig={
                 'MaxItems': 123,
                 'PageSize': 123,
                 'StartingToken': 'string'
             }
         )

      Parameters:
         * **Tags** (*dict*) --

           Filters the list by an Amazon Web Services resource tag.

           * *(string) --*

             * *(string) --*

         * **PaginationConfig** (*dict*) --

           A dictionary that provides parameters to control
           pagination.

           * **MaxItems** *(integer) --*

             The total number of items to return. If more items are
             available than the value specified in "MaxItems", a
             "NextToken" is provided in the output that you can use
             to resume pagination.

           * **PageSize** *(integer) --*

             The size of each page.

           * **StartingToken** *(string) --*

             A token to specify where to start paginating. This is the
             "NextToken" from a previous response.

      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'Blueprints': [
                    'string',
                ],

            }

         **Response Structure**

         * *(dict) --*

           * **Blueprints** *(list) --*

             List of names of blueprints in the account.

             * *(string) --*
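
The "PaginationConfig" keys above can be combined to cap the total number of results while controlling the size of each underlying request. The following is a sketch under that assumption; the helper name "first_blueprints" and its default values are hypothetical.

```python
from typing import Any, List


def first_blueprints(client: Any, limit: int = 50, page_size: int = 25) -> List[str]:
    """Return at most `limit` blueprint names (hypothetical helper).

    `client` is assumed to be a Glue client, e.g. boto3.client("glue").
    """
    paginator = client.get_paginator("list_blueprints")
    pages = paginator.paginate(
        # MaxItems caps the overall total; PageSize sizes each request.
        PaginationConfig={"MaxItems": limit, "PageSize": page_size}
    )
    names: List[str] = []
    for page in pages:
        names.extend(page.get("Blueprints", []))
    return names
```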


GetJobs
*******

class Glue.Paginator.GetJobs

      paginator = client.get_paginator('get_jobs')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.get_jobs()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             PaginationConfig={
                 'MaxItems': 123,
                 'PageSize': 123,
                 'StartingToken': 'string'
             }
         )

      Parameters:
         **PaginationConfig** (*dict*) --

         A dictionary that provides parameters to control pagination.

         * **MaxItems** *(integer) --*

           The total number of items to return. If more items are
           available than the value specified in "MaxItems", a
           "NextToken" is provided in the output that you can use to
           resume pagination.

         * **PageSize** *(integer) --*

           The size of each page.

         * **StartingToken** *(string) --*

           A token to specify where to start paginating. This is the
           "NextToken" from a previous response.

      Return type:
         dict

      Returns:
      **Response Syntax**

         # This section is too large to render.
         # Please see the AWS API Documentation linked below.

      AWS API Documentation

      **Response Structure**

         # This section is too large to render.
         # Please see the AWS API Documentation linked below.

      AWS API Documentation
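
Because the response structure is not rendered here, the sketch below relies on an assumption taken from the Glue GetJobs API reference rather than from this page: each page carries a "Jobs" list whose items have a "Name" key. The generator name "iter_job_names" is hypothetical.

```python
from typing import Any, Iterator


def iter_job_names(client: Any) -> Iterator[str]:
    """Yield job names lazily, one page at a time (hypothetical helper).

    Assumes each page has a "Jobs" list of dicts with a "Name" key;
    that shape is not shown in this section.
    """
    paginator = client.get_paginator("get_jobs")
    for page in paginator.paginate():
        for job in page.get("Jobs", []):
            yield job["Name"]
```

A generator keeps memory use flat when job definitions are large, since only one page is held at a time.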


ListJobs
********

class Glue.Paginator.ListJobs

      paginator = client.get_paginator('list_jobs')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.list_jobs()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             Tags={
                 'string': 'string'
             },
             PaginationConfig={
                 'MaxItems': 123,
                 'PageSize': 123,
                 'StartingToken': 'string'
             }
         )

      Parameters:
         * **Tags** (*dict*) --

           Specifies to return only these tagged resources.

           * *(string) --*

             * *(string) --*

         * **PaginationConfig** (*dict*) --

           A dictionary that provides parameters to control
           pagination.

           * **MaxItems** *(integer) --*

             The total number of items to return. If more items are
             available than the value specified in "MaxItems", a
             "NextToken" is provided in the output that you can use
             to resume pagination.

           * **PageSize** *(integer) --*

             The size of each page.

           * **StartingToken** *(string) --*

             A token to specify where to start paginating. This is the
             "NextToken" from a previous response.

      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'JobNames': [
                    'string',
                ],

            }

         **Response Structure**

         * *(dict) --*

           * **JobNames** *(list) --*

             The names of all jobs in the account, or the jobs with
             the specified tags.

             * *(string) --*
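
"MaxItems" and "StartingToken" together allow a long listing to be processed in batches. The sketch below assumes the iterator returned by "paginate()" exposes botocore's "resume_token" attribute once "MaxItems" has truncated the listing; the helper name "list_jobs_batch" is hypothetical.

```python
from typing import Any, Dict, List, Optional, Tuple


def list_jobs_batch(
    client: Any, batch_size: int, starting_token: Optional[str] = None
) -> Tuple[List[str], Optional[str]]:
    """Fetch up to `batch_size` job names plus a token for the next batch."""
    config: Dict[str, Any] = {"MaxItems": batch_size}
    if starting_token:
        config["StartingToken"] = starting_token
    page_iterator = client.get_paginator("list_jobs").paginate(
        PaginationConfig=config
    )
    names: List[str] = []
    for page in page_iterator:
        names.extend(page.get("JobNames", []))
    # resume_token stays None when the listing was exhausted.
    return names, page_iterator.resume_token
```

Calling the helper again with the returned token continues where the previous batch stopped; a token of None signals that nothing is left.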


ListRegistries
**************

class Glue.Paginator.ListRegistries

      paginator = client.get_paginator('list_registries')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.list_registries()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             PaginationConfig={
                 'MaxItems': 123,
                 'PageSize': 123,
                 'StartingToken': 'string'
             }
         )

      Parameters:
         **PaginationConfig** (*dict*) --

         A dictionary that provides parameters to control pagination.

         * **MaxItems** *(integer) --*

           The total number of items to return. If more items are
           available than the value specified in "MaxItems", a
           "NextToken" is provided in the output that you can use to
           resume pagination.

         * **PageSize** *(integer) --*

           The size of each page.

         * **StartingToken** *(string) --*

           A token to specify where to start paginating. This is the
           "NextToken" from a previous response.

      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'Registries': [
                    {
                        'RegistryName': 'string',
                        'RegistryArn': 'string',
                        'Description': 'string',
                        'Status': 'AVAILABLE'|'DELETING',
                        'CreatedTime': 'string',
                        'UpdatedTime': 'string'
                    },
                ],

            }

         **Response Structure**

         * *(dict) --*

           * **Registries** *(list) --*

             An array of "RegistryDetailedListItem" objects containing
             minimal details of each registry.

             * *(dict) --*

               A structure containing the details for a registry.

               * **RegistryName** *(string) --*

                 The name of the registry.

               * **RegistryArn** *(string) --*

                 The Amazon Resource Name (ARN) of the registry.

               * **Description** *(string) --*

                 A description of the registry.

               * **Status** *(string) --*

                 The status of the registry.

               * **CreatedTime** *(string) --*

                 The date the registry was created.

               * **UpdatedTime** *(string) --*

                 The date the registry was updated.
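
As an illustration of reading the structure above, this sketch keeps only registries whose "Status" is "AVAILABLE", skipping those being deleted. The helper name "available_registry_names" is hypothetical; "client" is assumed to be a Glue client.

```python
from typing import Any, List


def available_registry_names(client: Any) -> List[str]:
    """Return names of registries with Status AVAILABLE (hypothetical helper)."""
    names: List[str] = []
    for page in client.get_paginator("list_registries").paginate():
        for registry in page.get("Registries", []):
            # Status is either 'AVAILABLE' or 'DELETING' per the syntax above.
            if registry.get("Status") == "AVAILABLE":
                names.append(registry["RegistryName"])
    return names
```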


GetTriggers
***********

class Glue.Paginator.GetTriggers

      paginator = client.get_paginator('get_triggers')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.get_triggers()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             DependentJobName='string',
             PaginationConfig={
                 'MaxItems': 123,
                 'PageSize': 123,
                 'StartingToken': 'string'
             }
         )

      Parameters:
         * **DependentJobName** (*string*) -- The name of the job to
           retrieve triggers for. The trigger that can start this job
           is returned, and if there is no such trigger, all triggers
           are returned.

         * **PaginationConfig** (*dict*) --

           A dictionary that provides parameters to control
           pagination.

           * **MaxItems** *(integer) --*

             The total number of items to return. If more items are
             available than the value specified in "MaxItems", a
             "NextToken" is provided in the output that you can use
             to resume pagination.

           * **PageSize** *(integer) --*

             The size of each page.

           * **StartingToken** *(string) --*

             A token to specify where to start paginating. This is the
             "NextToken" from a previous response.

      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'Triggers': [
                    {
                        'Name': 'string',
                        'WorkflowName': 'string',
                        'Id': 'string',
                        'Type': 'SCHEDULED'|'CONDITIONAL'|'ON_DEMAND'|'EVENT',
                        'State': 'CREATING'|'CREATED'|'ACTIVATING'|'ACTIVATED'|'DEACTIVATING'|'DEACTIVATED'|'DELETING'|'UPDATING',
                        'Description': 'string',
                        'Schedule': 'string',
                        'Actions': [
                            {
                                'JobName': 'string',
                                'Arguments': {
                                    'string': 'string'
                                },
                                'Timeout': 123,
                                'SecurityConfiguration': 'string',
                                'NotificationProperty': {
                                    'NotifyDelayAfter': 123
                                },
                                'CrawlerName': 'string'
                            },
                        ],
                        'Predicate': {
                            'Logical': 'AND'|'ANY',
                            'Conditions': [
                                {
                                    'LogicalOperator': 'EQUALS',
                                    'JobName': 'string',
                                    'State': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT'|'ERROR'|'WAITING'|'EXPIRED',
                                    'CrawlerName': 'string',
                                    'CrawlState': 'RUNNING'|'CANCELLING'|'CANCELLED'|'SUCCEEDED'|'FAILED'|'ERROR'
                                },
                            ]
                        },
                        'EventBatchingCondition': {
                            'BatchSize': 123,
                            'BatchWindow': 123
                        }
                    },
                ],

            }

         **Response Structure**

         * *(dict) --*

           * **Triggers** *(list) --*

             A list of triggers for the specified job.

             * *(dict) --*

               Information about a specific trigger.

               * **Name** *(string) --*

                 The name of the trigger.

               * **WorkflowName** *(string) --*

                 The name of the workflow associated with the trigger.

               * **Id** *(string) --*

                 Reserved for future use.

               * **Type** *(string) --*

                 The type of trigger that this is.

               * **State** *(string) --*

                 The current state of the trigger.

               * **Description** *(string) --*

                 A description of this trigger.

               * **Schedule** *(string) --*

                 A "cron" expression used to specify the schedule (see
                 Time-Based Schedules for Jobs and Crawlers). For
                 example, to run something every day at 12:15 UTC, you
                 would specify: "cron(15 12 * * ? *)".

               * **Actions** *(list) --*

                 The actions initiated by this trigger.

                 * *(dict) --*

                   Defines an action to be initiated by a trigger.

                   * **JobName** *(string) --*

                     The name of a job to be run.

                   * **Arguments** *(dict) --*

                     The job arguments used when this trigger fires.
                     For this job run, they replace the default
                     arguments set in the job definition itself.

                     You can specify arguments here that your own job-
                     execution script consumes, as well as arguments
                     that Glue itself consumes.

                     For information about how to specify and consume
                     your own Job arguments, see the Calling Glue APIs
                     in Python topic in the developer guide.

                     For information about the key-value pairs that
                     Glue consumes to set up your job, see the Special
                     Parameters Used by Glue topic in the developer
                     guide.

                     * *(string) --*

                       * *(string) --*

                   * **Timeout** *(integer) --*

                     The "JobRun" timeout in minutes. This is the
                     maximum time that a job run can consume resources
                     before it is terminated and enters "TIMEOUT"
                     status. This overrides the timeout value set in
                     the parent job.

                     Jobs must have timeout values less than 7 days
                     (10080 minutes). Otherwise, the jobs will throw
                     an exception.

                     When the value is left blank, the timeout
                     defaults to 2880 minutes.

                     Any existing Glue jobs that had a timeout value
                     greater than 7 days will be defaulted to 7 days.
                     For instance, if you have specified a timeout of
                     20 days for a batch job, it will be stopped on
                     the 7th day.

                     For streaming jobs, if you have set up a
                     maintenance window, it will be restarted during
                     the maintenance window after 7 days.

                   * **SecurityConfiguration** *(string) --*

                     The name of the "SecurityConfiguration" structure
                     to be used with this action.

                   * **NotificationProperty** *(dict) --*

                     Specifies configuration properties of a job run
                     notification.

                     * **NotifyDelayAfter** *(integer) --*

                       After a job run starts, the number of minutes
                       to wait before sending a job run delay
                       notification.

                   * **CrawlerName** *(string) --*

                     The name of the crawler to be used with this
                     action.

               * **Predicate** *(dict) --*

                 The predicate of this trigger, which defines when it
                 will fire.

                 * **Logical** *(string) --*

                   An optional field if only one condition is listed.
                   If multiple conditions are listed, then this field
                   is required.

                 * **Conditions** *(list) --*

                   A list of the conditions that determine when the
                   trigger will fire.

                   * *(dict) --*

                     Defines a condition under which a trigger fires.

                     * **LogicalOperator** *(string) --*

                       A logical operator.

                     * **JobName** *(string) --*

                       The name of the job whose "JobRuns" this
                       condition applies to, and on which this trigger
                       waits.

                     * **State** *(string) --*

                       The condition state. Currently, the only job
                       states that a trigger can listen for are
                       "SUCCEEDED", "STOPPED", "FAILED", and
                       "TIMEOUT". The only crawler states that a
                       trigger can listen for are "SUCCEEDED",
                       "FAILED", and "CANCELLED".

                     * **CrawlerName** *(string) --*

                       The name of the crawler to which this condition
                       applies.

                     * **CrawlState** *(string) --*

                       The state of the crawler to which this
                       condition applies.

               * **EventBatchingCondition** *(dict) --*

                 Batch condition that must be met (specified number of
                 events received or batch time window expired) before
                 the EventBridge event trigger fires.

                 * **BatchSize** *(integer) --*

                   Number of events that must be received from Amazon
                   EventBridge before the EventBridge event trigger
                   fires.

                 * **BatchWindow** *(integer) --*

                   Window of time in seconds after which the
                   EventBridge event trigger fires. The window starts
                   when the first event is received.
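
The nested trigger structure above can be reduced to a compact summary. The sketch below keeps only "SCHEDULED" triggers and records each one's "cron" schedule together with the jobs its actions start. The helper name "scheduled_triggers" and the shape of the returned dict are hypothetical choices for illustration.

```python
from typing import Any, Dict, Optional


def scheduled_triggers(
    client: Any, dependent_job_name: Optional[str] = None
) -> Dict[str, Dict[str, Any]]:
    """Map trigger name -> schedule and job names (hypothetical helper)."""
    kwargs = {"DependentJobName": dependent_job_name} if dependent_job_name else {}
    summary: Dict[str, Dict[str, Any]] = {}
    for page in client.get_paginator("get_triggers").paginate(**kwargs):
        for trigger in page.get("Triggers", []):
            if trigger.get("Type") != "SCHEDULED":
                continue
            # An action names either a job or a crawler; keep the jobs only.
            jobs = [a["JobName"] for a in trigger.get("Actions", []) if "JobName" in a]
            summary[trigger["Name"]] = {
                "schedule": trigger.get("Schedule"),
                "jobs": jobs,
            }
    return summary
```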


GetSecurityConfigurations
*************************

class Glue.Paginator.GetSecurityConfigurations

      paginator = client.get_paginator('get_security_configurations')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.get_security_configurations()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             PaginationConfig={
                 'MaxItems': 123,
                 'PageSize': 123,
                 'StartingToken': 'string'
             }
         )

      Parameters:
         **PaginationConfig** (*dict*) --

         A dictionary that provides parameters to control pagination.

         * **MaxItems** *(integer) --*

           The total number of items to return. If more items are
           available than the value specified in "MaxItems", a
           "NextToken" is provided in the output that you can use to
           resume pagination.

         * **PageSize** *(integer) --*

           The size of each page.

         * **StartingToken** *(string) --*

           A token to specify where to start paginating. This is the
           "NextToken" from a previous response.

      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'SecurityConfigurations': [
                    {
                        'Name': 'string',
                        'CreatedTimeStamp': datetime(2015, 1, 1),
                        'EncryptionConfiguration': {
                            'S3Encryption': [
                                {
                                    'S3EncryptionMode': 'DISABLED'|'SSE-KMS'|'SSE-S3',
                                    'KmsKeyArn': 'string'
                                },
                            ],
                            'CloudWatchEncryption': {
                                'CloudWatchEncryptionMode': 'DISABLED'|'SSE-KMS',
                                'KmsKeyArn': 'string'
                            },
                            'JobBookmarksEncryption': {
                                'JobBookmarksEncryptionMode': 'DISABLED'|'CSE-KMS',
                                'KmsKeyArn': 'string'
                            },
                            'DataQualityEncryption': {
                                'DataQualityEncryptionMode': 'DISABLED'|'SSE-KMS',
                                'KmsKeyArn': 'string'
                            }
                        }
                    },
                ],

            }

         **Response Structure**

         * *(dict) --*

           * **SecurityConfigurations** *(list) --*

             A list of security configurations.

             * *(dict) --*

               Specifies a security configuration.

               * **Name** *(string) --*

                 The name of the security configuration.

               * **CreatedTimeStamp** *(datetime) --*

                 The time at which this security configuration was
                 created.

               * **EncryptionConfiguration** *(dict) --*

                 The encryption configuration associated with this
                 security configuration.

                 * **S3Encryption** *(list) --*

                   The encryption configuration for Amazon Simple
                   Storage Service (Amazon S3) data.

                   * *(dict) --*

                     Specifies how Amazon Simple Storage Service
                     (Amazon S3) data should be encrypted.

                     * **S3EncryptionMode** *(string) --*

                       The encryption mode to use for Amazon S3 data.

                     * **KmsKeyArn** *(string) --*

                       The Amazon Resource Name (ARN) of the KMS key
                       to be used to encrypt the data.

                 * **CloudWatchEncryption** *(dict) --*

                   The encryption configuration for Amazon CloudWatch.

                   * **CloudWatchEncryptionMode** *(string) --*

                     The encryption mode to use for CloudWatch data.

                   * **KmsKeyArn** *(string) --*

                     The Amazon Resource Name (ARN) of the KMS key to
                     be used to encrypt the data.

                 * **JobBookmarksEncryption** *(dict) --*

                   The encryption configuration for job bookmarks.

                   * **JobBookmarksEncryptionMode** *(string) --*

                     The encryption mode to use for job bookmarks
                     data.

                   * **KmsKeyArn** *(string) --*

                     The Amazon Resource Name (ARN) of the KMS key to
                     be used to encrypt the data.

                 * **DataQualityEncryption** *(dict) --*

                   The encryption configuration for Glue Data Quality
                   assets.

                   * **DataQualityEncryptionMode** *(string) --*

                     The encryption mode to use for encrypting Data
                     Quality assets. These assets include data quality
                     rulesets, results, statistics, anomaly detection
                     models and observations.

                     Valid values are "SSE-KMS" for encryption using a
                     customer-managed KMS key, or "DISABLED".

                   * **KmsKeyArn** *(string) --*

                     The Amazon Resource Name (ARN) of the KMS key to
                     be used to encrypt the data.
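
One possible use of this structure is a quick audit of which encryption areas are switched off in each security configuration. The sketch below treats a missing area as disabled, which is an assumption of this example and not something the API states. The helper names "disabled_areas" and "audit_security_configurations" are hypothetical.

```python
from typing import Any, Dict, List

# Single-dict areas and the key holding their mode, per the syntax above.
_AREA_MODE_KEYS = (
    ("CloudWatchEncryption", "CloudWatchEncryptionMode"),
    ("JobBookmarksEncryption", "JobBookmarksEncryptionMode"),
    ("DataQualityEncryption", "DataQualityEncryptionMode"),
)


def disabled_areas(config: Dict[str, Any]) -> List[str]:
    """List encryption areas that are DISABLED or absent (assumption: absent = disabled)."""
    enc = config.get("EncryptionConfiguration", {})
    disabled: List[str] = []
    # S3Encryption is a list of dicts, unlike the other three areas.
    s3_modes = [e.get("S3EncryptionMode", "DISABLED") for e in enc.get("S3Encryption", [])]
    if not s3_modes or all(mode == "DISABLED" for mode in s3_modes):
        disabled.append("S3Encryption")
    for area, mode_key in _AREA_MODE_KEYS:
        if enc.get(area, {}).get(mode_key, "DISABLED") == "DISABLED":
            disabled.append(area)
    return disabled


def audit_security_configurations(client: Any) -> Dict[str, List[str]]:
    """Map each security configuration name to its disabled areas."""
    report: Dict[str, List[str]] = {}
    paginator = client.get_paginator("get_security_configurations")
    for page in paginator.paginate():
        for config in page.get("SecurityConfigurations", []):
            report[config["Name"]] = disabled_areas(config)
    return report
```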


GetCrawlers
***********

class Glue.Paginator.GetCrawlers

      paginator = client.get_paginator('get_crawlers')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.get_crawlers()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             PaginationConfig={
                 'MaxItems': 123,
                 'PageSize': 123,
                 'StartingToken': 'string'
             }
         )

      Parameters:
         **PaginationConfig** (*dict*) --

         A dictionary that provides parameters to control pagination.

         * **MaxItems** *(integer) --*

           The total number of items to return. If more items are
           available than the value specified in "MaxItems", a
           "NextToken" is provided in the output that you can use to
           resume pagination.

         * **PageSize** *(integer) --*

           The size of each page.

         * **StartingToken** *(string) --*

           A token to specify where to start paginating. This is the
           "NextToken" from a previous response.

      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'Crawlers': [
                    {
                        'Name': 'string',
                        'Role': 'string',
                        'Targets': {
                            'S3Targets': [
                                {
                                    'Path': 'string',
                                    'Exclusions': [
                                        'string',
                                    ],
                                    'ConnectionName': 'string',
                                    'SampleSize': 123,
                                    'EventQueueArn': 'string',
                                    'DlqEventQueueArn': 'string'
                                },
                            ],
                            'JdbcTargets': [
                                {
                                    'ConnectionName': 'string',
                                    'Path': 'string',
                                    'Exclusions': [
                                        'string',
                                    ],
                                    'EnableAdditionalMetadata': [
                                        'COMMENTS'|'RAWTYPES',
                                    ]
                                },
                            ],
                            'MongoDBTargets': [
                                {
                                    'ConnectionName': 'string',
                                    'Path': 'string',
                                    'ScanAll': True|False
                                },
                            ],
                            'DynamoDBTargets': [
                                {
                                    'Path': 'string',
                                    'scanAll': True|False,
                                    'scanRate': 123.0
                                },
                            ],
                            'CatalogTargets': [
                                {
                                    'DatabaseName': 'string',
                                    'Tables': [
                                        'string',
                                    ],
                                    'ConnectionName': 'string',
                                    'EventQueueArn': 'string',
                                    'DlqEventQueueArn': 'string'
                                },
                            ],
                            'DeltaTargets': [
                                {
                                    'DeltaTables': [
                                        'string',
                                    ],
                                    'ConnectionName': 'string',
                                    'WriteManifest': True|False,
                                    'CreateNativeDeltaTable': True|False
                                },
                            ],
                            'IcebergTargets': [
                                {
                                    'Paths': [
                                        'string',
                                    ],
                                    'ConnectionName': 'string',
                                    'Exclusions': [
                                        'string',
                                    ],
                                    'MaximumTraversalDepth': 123
                                },
                            ],
                            'HudiTargets': [
                                {
                                    'Paths': [
                                        'string',
                                    ],
                                    'ConnectionName': 'string',
                                    'Exclusions': [
                                        'string',
                                    ],
                                    'MaximumTraversalDepth': 123
                                },
                            ]
                        },
                        'DatabaseName': 'string',
                        'Description': 'string',
                        'Classifiers': [
                            'string',
                        ],
                        'RecrawlPolicy': {
                            'RecrawlBehavior': 'CRAWL_EVERYTHING'|'CRAWL_NEW_FOLDERS_ONLY'|'CRAWL_EVENT_MODE'
                        },
                        'SchemaChangePolicy': {
                            'UpdateBehavior': 'LOG'|'UPDATE_IN_DATABASE',
                            'DeleteBehavior': 'LOG'|'DELETE_FROM_DATABASE'|'DEPRECATE_IN_DATABASE'
                        },
                        'LineageConfiguration': {
                            'CrawlerLineageSettings': 'ENABLE'|'DISABLE'
                        },
                        'State': 'READY'|'RUNNING'|'STOPPING',
                        'TablePrefix': 'string',
                        'Schedule': {
                            'ScheduleExpression': 'string',
                            'State': 'SCHEDULED'|'NOT_SCHEDULED'|'TRANSITIONING'
                        },
                        'CrawlElapsedTime': 123,
                        'CreationTime': datetime(2015, 1, 1),
                        'LastUpdated': datetime(2015, 1, 1),
                        'LastCrawl': {
                            'Status': 'SUCCEEDED'|'CANCELLED'|'FAILED',
                            'ErrorMessage': 'string',
                            'LogGroup': 'string',
                            'LogStream': 'string',
                            'MessagePrefix': 'string',
                            'StartTime': datetime(2015, 1, 1)
                        },
                        'Version': 123,
                        'Configuration': 'string',
                        'CrawlerSecurityConfiguration': 'string',
                        'LakeFormationConfiguration': {
                            'UseLakeFormationCredentials': True|False,
                            'AccountId': 'string'
                        }
                    },
                ],
            }

         **Response Structure**

         * *(dict) --*

           * **Crawlers** *(list) --*

             A list of crawler metadata.

             * *(dict) --*

               Specifies a crawler program that examines a data source
               and uses classifiers to try to determine its schema. If
               successful, the crawler records metadata concerning the
               data source in the Glue Data Catalog.

               * **Name** *(string) --*

                 The name of the crawler.

               * **Role** *(string) --*

                 The Amazon Resource Name (ARN) of an IAM role that's
                 used to access customer resources, such as Amazon
                 Simple Storage Service (Amazon S3) data.

               * **Targets** *(dict) --*

                 A collection of targets to crawl.

                 * **S3Targets** *(list) --*

                   Specifies Amazon Simple Storage Service (Amazon S3)
                   targets.

                   * *(dict) --*

                     Specifies a data store in Amazon Simple Storage
                     Service (Amazon S3).

                     * **Path** *(string) --*

                       The path to the Amazon S3 target.

                     * **Exclusions** *(list) --*

                       A list of glob patterns that specify what to
                       exclude from the crawl. For more information,
                       see Catalog Tables with a Crawler.

                       * *(string) --*

                     * **ConnectionName** *(string) --*

                       The name of a connection which allows a job or
                       crawler to access data in Amazon S3 within an
                       Amazon Virtual Private Cloud environment
                       (Amazon VPC).

                     * **SampleSize** *(integer) --*

                       Sets the number of files in each leaf folder to
                       be crawled when crawling sample files in a
                       dataset. If not set, all the files are crawled.
                       A valid value is an integer between 1 and 249.

                     * **EventQueueArn** *(string) --*

                       A valid Amazon SQS ARN. For example,
                       "arn:aws:sqs:region:account:sqs".

                     * **DlqEventQueueArn** *(string) --*

                       A valid Amazon dead-letter SQS ARN. For
                       example,
                       "arn:aws:sqs:region:account:deadLetterQueue".

                 * **JdbcTargets** *(list) --*

                   Specifies JDBC targets.

                   * *(dict) --*

                     Specifies a JDBC data store to crawl.

                     * **ConnectionName** *(string) --*

                       The name of the connection to use to connect to
                       the JDBC target.

                     * **Path** *(string) --*

                       The path of the JDBC target.

                     * **Exclusions** *(list) --*

                       A list of glob patterns that specify what to
                       exclude from the crawl. For more information,
                       see Catalog Tables with a Crawler.

                       * *(string) --*

                     * **EnableAdditionalMetadata** *(list) --*

                       Specify a value of "RAWTYPES" or "COMMENTS" to
                       enable additional metadata in table responses.
                       "RAWTYPES" provides the native-level datatype.
                       "COMMENTS" provides comments associated with a
                       column or table in the database.

                       If you do not need additional metadata, keep
                       the field empty.

                       * *(string) --*

                 * **MongoDBTargets** *(list) --*

                   Specifies Amazon DocumentDB or MongoDB targets.

                   * *(dict) --*

                     Specifies an Amazon DocumentDB or MongoDB data
                     store to crawl.

                     * **ConnectionName** *(string) --*

                       The name of the connection to use to connect to
                       the Amazon DocumentDB or MongoDB target.

                     * **Path** *(string) --*

                       The path of the Amazon DocumentDB or MongoDB
                       target (database/collection).

                     * **ScanAll** *(boolean) --*

                       Indicates whether to scan all the records, or
                       to sample rows from the table. Scanning all the
                       records can take a long time when the table is
                       not a high throughput table.

                       A value of "true" means to scan all records,
                       while a value of "false" means to sample the
                       records. If no value is specified, the value
                       defaults to "true".

                 * **DynamoDBTargets** *(list) --*

                   Specifies Amazon DynamoDB targets.

                   * *(dict) --*

                     Specifies an Amazon DynamoDB table to crawl.

                     * **Path** *(string) --*

                       The name of the DynamoDB table to crawl.

                     * **scanAll** *(boolean) --*

                       Indicates whether to scan all the records, or
                       to sample rows from the table. Scanning all the
                       records can take a long time when the table is
                       not a high throughput table.

                       A value of "true" means to scan all records,
                       while a value of "false" means to sample the
                       records. If no value is specified, the value
                       defaults to "true".

                     * **scanRate** *(float) --*

                       The percentage of the configured read capacity
                       units to use by the Glue crawler. Read capacity
                       units is a term defined by DynamoDB, and is a
                       numeric value that acts as rate limiter for the
                       number of reads that can be performed on that
                       table per second.

                       The valid values are null or a value between
                       0.1 and 1.5. A null value is used when the user
                       does not provide a value, and defaults to 0.5
                       of the configured Read Capacity Unit (for
                       provisioned tables), or 0.25 of the max
                       configured Read Capacity Unit (for tables using
                       on-demand mode).

                 * **CatalogTargets** *(list) --*

                   Specifies Glue Data Catalog targets.

                   * *(dict) --*

                     Specifies a Glue Data Catalog target.

                     * **DatabaseName** *(string) --*

                       The name of the database to be synchronized.

                     * **Tables** *(list) --*

                       A list of the tables to be synchronized.

                       * *(string) --*

                     * **ConnectionName** *(string) --*

                       The name of the connection for an Amazon
                       S3-backed Data Catalog table to be a target of
                       the crawl when using a "Catalog" connection
                       type paired with a "NETWORK" Connection type.

                     * **EventQueueArn** *(string) --*

                       A valid Amazon SQS ARN. For example,
                       "arn:aws:sqs:region:account:sqs".

                     * **DlqEventQueueArn** *(string) --*

                       A valid Amazon dead-letter SQS ARN. For
                       example,
                       "arn:aws:sqs:region:account:deadLetterQueue".

                 * **DeltaTargets** *(list) --*

                   Specifies Delta data store targets.

                   * *(dict) --*

                     Specifies a Delta data store to crawl one or more
                     Delta tables.

                     * **DeltaTables** *(list) --*

                       A list of the Amazon S3 paths to the Delta
                       tables.

                       * *(string) --*

                     * **ConnectionName** *(string) --*

                       The name of the connection to use to connect to
                       the Delta table target.

                     * **WriteManifest** *(boolean) --*

                       Specifies whether to write the manifest files
                       to the Delta table path.

                     * **CreateNativeDeltaTable** *(boolean) --*

                       Specifies whether the crawler will create
                       native tables, to allow integration with query
                       engines that support querying of the Delta
                       transaction log directly.

                 * **IcebergTargets** *(list) --*

                   Specifies Apache Iceberg data store targets.

                   * *(dict) --*

                     Specifies an Apache Iceberg data source where
                     Iceberg tables are stored in Amazon S3.

                     * **Paths** *(list) --*

                       One or more Amazon S3 paths that contain
                       Iceberg metadata folders, in the form
                       "s3://bucket/prefix".

                       * *(string) --*

                     * **ConnectionName** *(string) --*

                       The name of the connection to use to connect to
                       the Iceberg target.

                     * **Exclusions** *(list) --*

                       A list of glob patterns that specify what to
                       exclude from the crawl. For more information,
                       see Catalog Tables with a Crawler.

                       * *(string) --*

                     * **MaximumTraversalDepth** *(integer) --*

                       The maximum depth of Amazon S3 paths that the
                       crawler can traverse to discover the Iceberg
                       metadata folder in your Amazon S3 path. Used to
                       limit the crawler run time.

                 * **HudiTargets** *(list) --*

                   Specifies Apache Hudi data store targets.

                   * *(dict) --*

                     Specifies an Apache Hudi data source.

                     * **Paths** *(list) --*

                       An array of Amazon S3 location strings for
                       Hudi, each indicating the root folder in which
                       the metadata files for a Hudi table reside. The
                       Hudi folder may be located in a child folder of
                       the root folder.

                       The crawler will scan all folders underneath a
                       path for a Hudi folder.

                       * *(string) --*

                     * **ConnectionName** *(string) --*

                       The name of the connection to use to connect to
                       the Hudi target. If your Hudi files are stored
                       in buckets that require VPC authorization, you
                       can set their connection properties here.

                     * **Exclusions** *(list) --*

                       A list of glob patterns that specify what to
                       exclude from the crawl. For more information,
                       see Catalog Tables with a Crawler.

                       * *(string) --*

                     * **MaximumTraversalDepth** *(integer) --*

                       The maximum depth of Amazon S3 paths that the
                       crawler can traverse to discover the Hudi
                       metadata folder in your Amazon S3 path. Used to
                       limit the crawler run time.

               * **DatabaseName** *(string) --*

                 The name of the database in which the crawler's
                 output is stored.

               * **Description** *(string) --*

                 A description of the crawler.

               * **Classifiers** *(list) --*

                 A list of UTF-8 strings that specify the custom
                 classifiers that are associated with the crawler.

                 * *(string) --*

               * **RecrawlPolicy** *(dict) --*

                 A policy that specifies whether to crawl the entire
                 dataset again, or to crawl only folders that were
                 added since the last crawler run.

                 * **RecrawlBehavior** *(string) --*

                   Specifies whether to crawl the entire dataset again
                   or to crawl only folders that were added since the
                   last crawler run.

                   A value of "CRAWL_EVERYTHING" specifies crawling
                   the entire dataset again.

                   A value of "CRAWL_NEW_FOLDERS_ONLY" specifies
                   crawling only folders that were added since the
                   last crawler run.

                   A value of "CRAWL_EVENT_MODE" specifies crawling
                   only the changes identified by Amazon S3 events.

               * **SchemaChangePolicy** *(dict) --*

                 The policy that specifies update and delete behaviors
                 for the crawler.

                 * **UpdateBehavior** *(string) --*

                   The update behavior when the crawler finds a
                   changed schema.

                 * **DeleteBehavior** *(string) --*

                   The deletion behavior when the crawler finds a
                   deleted object.

               * **LineageConfiguration** *(dict) --*

                 A configuration that specifies whether data lineage
                 is enabled for the crawler.

                 * **CrawlerLineageSettings** *(string) --*

                   Specifies whether data lineage is enabled for the
                   crawler. Valid values are:

                   * ENABLE: enables data lineage for the crawler

                   * DISABLE: disables data lineage for the crawler

               * **State** *(string) --*

                 Indicates whether the crawler is running, or whether
                 a run is pending.

               * **TablePrefix** *(string) --*

                 The prefix added to the names of tables that are
                 created.

               * **Schedule** *(dict) --*

                 For scheduled crawlers, the schedule when the crawler
                 runs.

                 * **ScheduleExpression** *(string) --*

                   A "cron" expression used to specify the schedule
                   (see Time-Based Schedules for Jobs and Crawlers).
                   For example, to run something every day at 12:15
                   UTC, you would specify: "cron(15 12 * * ? *)".

                 * **State** *(string) --*

                   The state of the schedule.

               * **CrawlElapsedTime** *(integer) --*

                 If the crawler is running, contains the total time
                 elapsed since the last crawl began.

               * **CreationTime** *(datetime) --*

                 The time that the crawler was created.

               * **LastUpdated** *(datetime) --*

                 The time that the crawler was last updated.

               * **LastCrawl** *(dict) --*

                 The status of the last crawl, and potentially error
                 information if an error occurred.

                 * **Status** *(string) --*

                   Status of the last crawl.

                 * **ErrorMessage** *(string) --*

                   If an error occurred, the error information about
                   the last crawl.

                 * **LogGroup** *(string) --*

                   The log group for the last crawl.

                 * **LogStream** *(string) --*

                   The log stream for the last crawl.

                 * **MessagePrefix** *(string) --*

                   The prefix for a message about this crawl.

                 * **StartTime** *(datetime) --*

                   The time at which the crawl started.

               * **Version** *(integer) --*

                 The version of the crawler.

               * **Configuration** *(string) --*

                 Crawler configuration information. This versioned
                 JSON string allows users to specify aspects of a
                 crawler's behavior. For more information, see Setting
                 crawler configuration options.

               * **CrawlerSecurityConfiguration** *(string) --*

                 The name of the "SecurityConfiguration" structure to
                 be used by this crawler.

               * **LakeFormationConfiguration** *(dict) --*

                 Specifies whether the crawler should use Lake
                 Formation credentials for the crawler instead of the
                 IAM role credentials.

                 * **UseLakeFormationCredentials** *(boolean) --*

                   Specifies whether to use Lake Formation credentials
                   for the crawler instead of the IAM role
                   credentials.

                 * **AccountId** *(string) --*

                   Required for cross-account crawls. For crawls in
                   the same account as the target data, this can be
                   left as null.
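
The response structure above can be consumed page by page. The following is a minimal sketch, not an official recipe: the helper name "failed_crawler_names" and the default page size are illustrative assumptions, and "client" is assumed to be a Glue client such as the one returned by "boto3.client('glue')". The "LastCrawl" key is read defensively in case it is missing from a crawler entry.

```python
def failed_crawler_names(client, page_size=50):
    """Return the names of crawlers whose last crawl has Status 'FAILED'.

    `client` is assumed to be a Glue client (hypothetical helper, not part
    of boto3 itself).
    """
    paginator = client.get_paginator("get_crawlers")
    names = []
    # Each page is a dict shaped like the Response Syntax above.
    for page in paginator.paginate(PaginationConfig={"PageSize": page_size}):
        for crawler in page.get("Crawlers", []):
            # Read 'LastCrawl' defensively: treat a missing key as "no status".
            last_crawl = crawler.get("LastCrawl", {})
            if last_crawl.get("Status") == "FAILED":
                names.append(crawler["Name"])
    return names


# Example usage (requires AWS credentials and network access):
#   import boto3
#   print(failed_crawler_names(boto3.client("glue")))
```

Passing the client in as an argument keeps the helper easy to exercise with a stub object in tests.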


GetPartitionIndexes
*******************

class Glue.Paginator.GetPartitionIndexes

      paginator = client.get_paginator('get_partition_indexes')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.get_partition_indexes()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             CatalogId='string',
             DatabaseName='string',
             TableName='string',
             PaginationConfig={
                 'MaxItems': 123,
                 'StartingToken': 'string'
             }
         )

      Parameters:
         * **CatalogId** (*string*) -- The catalog ID where the table
           resides.

         * **DatabaseName** (*string*) --

           **[REQUIRED]**

           Specifies the name of a database from which you want to
           retrieve partition indexes.

         * **TableName** (*string*) --

           **[REQUIRED]**

           Specifies the name of a table for which you want to
           retrieve the partition indexes.

         * **PaginationConfig** (*dict*) --

           A dictionary that provides parameters to control
           pagination.

           * **MaxItems** *(integer) --*

             The total number of items to return. If the total number
             of items available is more than the value specified in
             max-items then a "NextToken" will be provided in the
             output that you can use to resume pagination.

           * **StartingToken** *(string) --*

             A token to specify where to start paginating. This is the
             "NextToken" from a previous response.

      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'PartitionIndexDescriptorList': [
                    {
                        'IndexName': 'string',
                        'Keys': [
                            {
                                'Name': 'string',
                                'Type': 'string'
                            },
                        ],
                        'IndexStatus': 'CREATING'|'ACTIVE'|'DELETING'|'FAILED',
                        'BackfillErrors': [
                            {
                                'Code': 'ENCRYPTED_PARTITION_ERROR'|'INTERNAL_ERROR'|'INVALID_PARTITION_TYPE_DATA_ERROR'|'MISSING_PARTITION_VALUE_ERROR'|'UNSUPPORTED_PARTITION_CHARACTER_ERROR',
                                'Partitions': [
                                    {
                                        'Values': [
                                            'string',
                                        ]
                                    },
                                ]
                            },
                        ]
                    },
                ],
            }

         **Response Structure**

         * *(dict) --*

           * **PartitionIndexDescriptorList** *(list) --*

             A list of index descriptors.

             * *(dict) --*

               A descriptor for a partition index in a table.

               * **IndexName** *(string) --*

                 The name of the partition index.

               * **Keys** *(list) --*

                 A list of one or more keys, as "KeySchemaElement"
                 structures, for the partition index.

                 * *(dict) --*

                   A partition key pair consisting of a name and a
                   type.

                   * **Name** *(string) --*

                     The name of a partition key.

                   * **Type** *(string) --*

                     The type of a partition key.

               * **IndexStatus** *(string) --*

                 The status of the partition index.

                 The possible statuses are:

                 * CREATING: The index is being created. When an index
                   is in a CREATING state, the index or its table
                   cannot be deleted.

                 * ACTIVE: The index creation succeeds.

                 * FAILED: The index creation fails.

                 * DELETING: The index is deleted from the list of
                   indexes.

               * **BackfillErrors** *(list) --*

                 A list of errors that can occur when registering
                 partition indexes for an existing table.

                 * *(dict) --*

                   A list of errors that can occur when registering
                   partition indexes for an existing table.

                   These errors give the details about why an index
                   registration failed and provide a limited number of
                   partitions in the response, so that you can fix the
                   partitions at fault and try registering the index
                   again. The most common set of errors that can occur
                   are categorized as follows:

                   * EncryptedPartitionError: The partitions are
                     encrypted.

                   * InvalidPartitionTypeDataError: The partition
                     value doesn't match the data type for that
                     partition column.

                   * MissingPartitionValueError: A partition value is
                     missing.

                   * UnsupportedPartitionCharacterError: Characters
                     inside the partition value are not supported. For
                     example: U+0000, U+0001, U+0002.

                   * InternalError: Any error which does not belong to
                     other error codes.

                   * **Code** *(string) --*

                     The error code for an error that occurred when
                     registering partition indexes for an existing
                     table.

                   * **Partitions** *(list) --*

                     A list of a limited number of partitions in the
                     response.

                     * *(dict) --*

                       Contains a list of values defining partitions.

                       * **Values** *(list) --*

                         The list of values.

                         * *(string) --*
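
To illustrate how the fields described above fit together, here is a minimal sketch that collects each index's status and any backfill error codes across all pages. The helper name "summarize_partition_indexes" and its return shape are illustrative assumptions; "client" is assumed to be a Glue client such as "boto3.client('glue')".

```python
def summarize_partition_indexes(client, database_name, table_name, catalog_id=None):
    """Map each index name to (IndexStatus, [backfill error codes]).

    Hypothetical helper; `client` is assumed to be a Glue client.
    """
    kwargs = {"DatabaseName": database_name, "TableName": table_name}
    if catalog_id is not None:
        # CatalogId is optional, so only send it when the caller supplies one.
        kwargs["CatalogId"] = catalog_id

    paginator = client.get_paginator("get_partition_indexes")
    summary = {}
    for page in paginator.paginate(**kwargs):
        for index in page.get("PartitionIndexDescriptorList", []):
            # BackfillErrors is read defensively in case the key is absent.
            codes = [err.get("Code") for err in index.get("BackfillErrors", [])]
            summary[index["IndexName"]] = (index["IndexStatus"], codes)
    return summary


# Example usage (requires AWS credentials; names below are placeholders):
#   import boto3
#   print(summarize_partition_indexes(boto3.client("glue"), "my_db", "my_table"))
```

An index reported as "FAILED" together with a non-empty list of codes points to the partitions that need fixing before the index is registered again.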


GetWorkflowRuns
***************

class Glue.Paginator.GetWorkflowRuns

      paginator = client.get_paginator('get_workflow_runs')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.get_workflow_runs()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             Name='string',
             IncludeGraph=True|False,
             PaginationConfig={
                 'MaxItems': 123,
                 'PageSize': 123,
                 'StartingToken': 'string'
             }
         )

      Parameters:
         * **Name** (*string*) --

           **[REQUIRED]**

           Name of the workflow whose run metadata should be
           returned.

         * **IncludeGraph** (*boolean*) -- Specifies whether to
           include the workflow graph in the response.

         * **PaginationConfig** (*dict*) --

           A dictionary that provides parameters to control
           pagination.

           * **MaxItems** *(integer) --*

             The total number of items to return. If the total number
             of items available is more than the value specified in
             max-items then a "NextToken" will be provided in the
             output that you can use to resume pagination.

           * **PageSize** *(integer) --*

             The size of each page.

           * **StartingToken** *(string) --*

             A token to specify where to start paginating. This is the
             "NextToken" from a previous response.

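The "MaxItems" and "StartingToken" settings described above can be combined to fetch runs in fixed-size batches. This is a minimal sketch, assuming that the iterator's "build_full_result()" method places the resume token under "NextToken" when the result was truncated by "MaxItems". The helper name "workflow_runs_in_batches" and the batch size are illustrative, and "client" is assumed to be a Glue client.

```python
def workflow_runs_in_batches(client, workflow_name, batch_size=25):
    """Yield lists of at most `batch_size` runs, resuming via StartingToken.

    Hypothetical helper; `client` is assumed to be a Glue client.
    """
    paginator = client.get_paginator("get_workflow_runs")
    token = None
    while True:
        config = {"MaxItems": batch_size}
        if token is not None:
            # Resume where the previous batch stopped.
            config["StartingToken"] = token
        result = paginator.paginate(
            Name=workflow_name,
            PaginationConfig=config,
        ).build_full_result()
        yield result.get("Runs", [])
        # No NextToken means there is nothing left to resume.
        token = result.get("NextToken")
        if token is None:
            break


# Example usage (requires AWS credentials; the workflow name is a placeholder):
#   import boto3
#   for batch in workflow_runs_in_batches(boto3.client("glue"), "my-workflow"):
#       print(len(batch))
```

Because the token is a plain string, it can also be stored and passed to a later process to continue from the same position.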
      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'Runs': [
                    {
                        'Name': 'string',
                        'WorkflowRunId': 'string',
                        'PreviousRunId': 'string',
                        'WorkflowRunProperties': {
                            'string': 'string'
                        },
                        'StartedOn': datetime(2015, 1, 1),
                        'CompletedOn': datetime(2015, 1, 1),
                        'Status': 'RUNNING'|'COMPLETED'|'STOPPING'|'STOPPED'|'ERROR',
                        'ErrorMessage': 'string',
                        'Statistics': {
                            'TotalActions': 123,
                            'TimeoutActions': 123,
                            'FailedActions': 123,
                            'StoppedActions': 123,
                            'SucceededActions': 123,
                            'RunningActions': 123,
                            'ErroredActions': 123,
                            'WaitingActions': 123
                        },
                        'Graph': {
                            'Nodes': [
                                {
                                    'Type': 'CRAWLER'|'JOB'|'TRIGGER',
                                    'Name': 'string',
                                    'UniqueId': 'string',
                                    'TriggerDetails': {
                                        'Trigger': {
                                            'Name': 'string',
                                            'WorkflowName': 'string',
                                            'Id': 'string',
                                            'Type': 'SCHEDULED'|'CONDITIONAL'|'ON_DEMAND'|'EVENT',
                                            'State': 'CREATING'|'CREATED'|'ACTIVATING'|'ACTIVATED'|'DEACTIVATING'|'DEACTIVATED'|'DELETING'|'UPDATING',
                                            'Description': 'string',
                                            'Schedule': 'string',
                                            'Actions': [
                                                {
                                                    'JobName': 'string',
                                                    'Arguments': {
                                                        'string': 'string'
                                                    },
                                                    'Timeout': 123,
                                                    'SecurityConfiguration': 'string',
                                                    'NotificationProperty': {
                                                        'NotifyDelayAfter': 123
                                                    },
                                                    'CrawlerName': 'string'
                                                },
                                            ],
                                            'Predicate': {
                                                'Logical': 'AND'|'ANY',
                                                'Conditions': [
                                                    {
                                                        'LogicalOperator': 'EQUALS',
                                                        'JobName': 'string',
                                                        'State': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT'|'ERROR'|'WAITING'|'EXPIRED',
                                                        'CrawlerName': 'string',
                                                        'CrawlState': 'RUNNING'|'CANCELLING'|'CANCELLED'|'SUCCEEDED'|'FAILED'|'ERROR'
                                                    },
                                                ]
                                            },
                                            'EventBatchingCondition': {
                                                'BatchSize': 123,
                                                'BatchWindow': 123
                                            }
                                        }
                                    },
                                    'JobDetails': {
                                        'JobRuns': [
                                            {
                                                'Id': 'string',
                                                'Attempt': 123,
                                                'PreviousRunId': 'string',
                                                'TriggerName': 'string',
                                                'JobName': 'string',
                                                'JobMode': 'SCRIPT'|'VISUAL'|'NOTEBOOK',
                                                'JobRunQueuingEnabled': True|False,
                                                'StartedOn': datetime(2015, 1, 1),
                                                'LastModifiedOn': datetime(2015, 1, 1),
                                                'CompletedOn': datetime(2015, 1, 1),
                                                'JobRunState': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT'|'ERROR'|'WAITING'|'EXPIRED',
                                                'Arguments': {
                                                    'string': 'string'
                                                },
                                                'ErrorMessage': 'string',
                                                'PredecessorRuns': [
                                                    {
                                                        'JobName': 'string',
                                                        'RunId': 'string'
                                                    },
                                                ],
                                                'AllocatedCapacity': 123,
                                                'ExecutionTime': 123,
                                                'Timeout': 123,
                                                'MaxCapacity': 123.0,
                                                'WorkerType': 'Standard'|'G.1X'|'G.2X'|'G.025X'|'G.4X'|'G.8X'|'Z.2X',
                                                'NumberOfWorkers': 123,
                                                'SecurityConfiguration': 'string',
                                                'LogGroupName': 'string',
                                                'NotificationProperty': {
                                                    'NotifyDelayAfter': 123
                                                },
                                                'GlueVersion': 'string',
                                                'DPUSeconds': 123.0,
                                                'ExecutionClass': 'FLEX'|'STANDARD',
                                                'MaintenanceWindow': 'string',
                                                'ProfileName': 'string',
                                                'StateDetail': 'string',
                                                'ExecutionRoleSessionPolicy': 'string'
                                            },
                                        ]
                                    },
                                    'CrawlerDetails': {
                                        'Crawls': [
                                            {
                                                'State': 'RUNNING'|'CANCELLING'|'CANCELLED'|'SUCCEEDED'|'FAILED'|'ERROR',
                                                'StartedOn': datetime(2015, 1, 1),
                                                'CompletedOn': datetime(2015, 1, 1),
                                                'ErrorMessage': 'string',
                                                'LogGroup': 'string',
                                                'LogStream': 'string'
                                            },
                                        ]
                                    }
                                },
                            ],
                            'Edges': [
                                {
                                    'SourceId': 'string',
                                    'DestinationId': 'string'
                                },
                            ]
                        },
                        'StartingEventBatchCondition': {
                            'BatchSize': 123,
                            'BatchWindow': 123
                        }
                    },
                ],

            }
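
         As a rough illustration of how a page shaped like the syntax
         above might be consumed, the sketch below counts runs by
         "Status", collects the IDs of runs whose "Statistics" report
         failed, timed-out, or errored actions, and groups graph node
         names by "Type". The function names and the "UNKNOWN"
         fallback label are assumptions made for this example; the
         sketch also assumes "Graph" may be absent when "IncludeGraph"
         was not requested.

```python
from collections import Counter


def summarize_runs(runs):
    """Return (status counts, IDs of runs that report problem actions)."""
    # "UNKNOWN" is a placeholder of this sketch, not a Glue status value.
    status_counts = Counter(run.get("Status", "UNKNOWN") for run in runs)

    problem_run_ids = []
    for run in runs:
        stats = run.get("Statistics", {})
        problems = sum(
            stats.get(key, 0)
            for key in ("FailedActions", "TimeoutActions", "ErroredActions")
        )
        if problems:
            problem_run_ids.append(run.get("WorkflowRunId"))
    return dict(status_counts), problem_run_ids


def nodes_by_type(run):
    """Group graph node names by node 'Type' ('CRAWLER', 'JOB', 'TRIGGER')."""
    grouped = {}
    # Read 'Graph' defensively: it may be missing from the run.
    for node in run.get("Graph", {}).get("Nodes", []):
        grouped.setdefault(node.get("Type"), []).append(node.get("Name"))
    return grouped
```

         Both helpers work on plain dictionaries, so they can be
         applied to each page of "response_iterator" or to a list of
         runs gathered from several pages.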

         **Response Structure**

         * *(dict) --*

           * **Runs** *(list) --*

             A list of workflow run metadata objects.

             * *(dict) --*

               A workflow run is an execution of a workflow providing
               all the runtime information.

               * **Name** *(string) --*

                 Name of the workflow that was run.

               * **WorkflowRunId** *(string) --*

                 The ID of this workflow run.

               * **PreviousRunId** *(string) --*

                 The ID of the previous workflow run.

               * **WorkflowRunProperties** *(dict) --*

                 The workflow run properties which were set during the
                 run.

                 * *(string) --*

                   * *(string) --*

               * **StartedOn** *(datetime) --*

                 The date and time when the workflow run was started.

               * **CompletedOn** *(datetime) --*

                 The date and time when the workflow run completed.

               * **Status** *(string) --*

                 The status of the workflow run.

               * **ErrorMessage** *(string) --*

                 This error message describes any error that may have
                 occurred in starting the workflow run. Currently the
                 only error message is "Concurrent runs exceeded for
                 workflow: "foo"."

               * **Statistics** *(dict) --*

                 The statistics of the run.

                 * **TotalActions** *(integer) --*

                   Total number of Actions in the workflow run.

                 * **TimeoutActions** *(integer) --*

                   Total number of Actions that timed out.

                 * **FailedActions** *(integer) --*

                   Total number of Actions that have failed.

                 * **StoppedActions** *(integer) --*

                   Total number of Actions that have stopped.

                 * **SucceededActions** *(integer) --*

                   Total number of Actions that have succeeded.

                 * **RunningActions** *(integer) --*

                   Total number of Actions in running state.

                 * **ErroredActions** *(integer) --*

                   Indicates the count of job runs in the ERROR state
                   in the workflow run.

                 * **WaitingActions** *(integer) --*

                   Indicates the count of job runs in WAITING state in
                   the workflow run.

               * **Graph** *(dict) --*

                 The graph representing all the Glue components that
                 belong to the workflow as nodes and directed
                 connections between them as edges.

                 * **Nodes** *(list) --*

                   A list of the Glue components that belong to the
                   workflow, represented as nodes.

                   * *(dict) --*

                     A node represents a Glue component (trigger,
                     crawler, or job) on a workflow graph.

                     * **Type** *(string) --*

                       The type of Glue component represented by the
                       node.

                     * **Name** *(string) --*

                       The name of the Glue component represented by
                       the node.

                     * **UniqueId** *(string) --*

                       The unique Id assigned to the node within the
                       workflow.

                     * **TriggerDetails** *(dict) --*

                       Details of the Trigger when the node represents
                       a Trigger.

                       * **Trigger** *(dict) --*

                         The information of the trigger represented by
                         the trigger node.

                         * **Name** *(string) --*

                           The name of the trigger.

                         * **WorkflowName** *(string) --*

                           The name of the workflow associated with
                           the trigger.

                         * **Id** *(string) --*

                           Reserved for future use.

                         * **Type** *(string) --*

                           The type of trigger that this is.

                         * **State** *(string) --*

                           The current state of the trigger.

                         * **Description** *(string) --*

                           A description of this trigger.

                         * **Schedule** *(string) --*

                           A "cron" expression used to specify the
                           schedule (see Time-Based Schedules for Jobs
                           and Crawlers). For example, to run
                           something every day at 12:15 UTC, you would
                           specify: "cron(15 12 * * ? *)".

                         * **Actions** *(list) --*

                           The actions initiated by this trigger.

                           * *(dict) --*

                             Defines an action to be initiated by a
                             trigger.

                             * **JobName** *(string) --*

                               The name of a job to be run.

                             * **Arguments** *(dict) --*

                               The job arguments used when this
                               trigger fires. For this job run, they
                               replace the default arguments set in
                               the job definition itself.

                               You can specify arguments here that
                               your own job-execution script consumes,
                               as well as arguments that Glue itself
                               consumes.

                               For information about how to specify
                               and consume your own Job arguments, see
                               the Calling Glue APIs in Python topic
                               in the developer guide.

                               For information about the key-value
                               pairs that Glue consumes to set up your
                               job, see the Special Parameters Used by
                               Glue topic in the developer guide.

                               * *(string) --*

                                 * *(string) --*

                             * **Timeout** *(integer) --*

                               The "JobRun" timeout in minutes. This
                               is the maximum time that a job run can
                               consume resources before it is
                               terminated and enters "TIMEOUT" status.
                               This overrides the timeout value set in
                               the parent job.

                               Jobs must have timeout values less than
                               7 days or 10080 minutes. Otherwise, the
                               jobs will throw an exception.

                               When the value is left blank, the
                               timeout is defaulted to 2880 minutes.

                               Any existing Glue jobs that had a
                               timeout value greater than 7 days will
                               be defaulted to 7 days. For instance,
                               if you have specified a timeout of 20
                               days for a batch job, it will be
                               stopped on the 7th day.

                               For streaming jobs, if you have set up
                               a maintenance window, it will be
                               restarted during the maintenance window
                               after 7 days.

                             * **SecurityConfiguration** *(string) --*

                               The name of the "SecurityConfiguration"
                               structure to be used with this action.

                             * **NotificationProperty** *(dict) --*

                               Specifies configuration properties of a
                               job run notification.

                               * **NotifyDelayAfter** *(integer) --*

                                 After a job run starts, the number of
                                 minutes to wait before sending a job
                                 run delay notification.

                             * **CrawlerName** *(string) --*

                               The name of the crawler to be used with
                               this action.

                         * **Predicate** *(dict) --*

                           The predicate of this trigger, which
                           defines when it will fire.

                           * **Logical** *(string) --*

                             An optional field if only one condition
                             is listed. If multiple conditions are
                             listed, then this field is required.

                           * **Conditions** *(list) --*

                             A list of the conditions that determine
                             when the trigger will fire.

                             * *(dict) --*

                               Defines a condition under which a
                               trigger fires.

                               * **LogicalOperator** *(string) --*

                                 A logical operator.

                               * **JobName** *(string) --*

                                 The name of the job whose "JobRuns"
                                 this condition applies to, and on
                                 which this trigger waits.

                               * **State** *(string) --*

                                 The condition state. Currently, the
                                 only job states that a trigger can
                                 listen for are "SUCCEEDED",
                                 "STOPPED", "FAILED", and "TIMEOUT".
                                 The only crawler states that a
                                 trigger can listen for are
                                 "SUCCEEDED", "FAILED", and
                                 "CANCELLED".

                               * **CrawlerName** *(string) --*

                                 The name of the crawler to which this
                                 condition applies.

                               * **CrawlState** *(string) --*

                                 The state of the crawler to which
                                 this condition applies.

                         * **EventBatchingCondition** *(dict) --*

                           Batch condition that must be met (specified
                           number of events received or batch time
                           window expired) before EventBridge event
                           trigger fires.

                           * **BatchSize** *(integer) --*

                             Number of events that must be received
                             from Amazon EventBridge before
                             EventBridge event trigger fires.

                           * **BatchWindow** *(integer) --*

                             Window of time in seconds after which
                             EventBridge event trigger fires. Window
                             starts when first event is received.

                     * **JobDetails** *(dict) --*

                       Details of the Job when the node represents a
                       Job.

                       * **JobRuns** *(list) --*

                         The information for the job runs represented
                         by the job node.

                         * *(dict) --*

                           Contains information about a job run.

                           * **Id** *(string) --*

                             The ID of this job run.

                           * **Attempt** *(integer) --*

                             The number of the attempt to run this
                             job.

                           * **PreviousRunId** *(string) --*

                             The ID of the previous run of this job.
                             For example, the "JobRunId" specified in
                             the "StartJobRun" action.

                           * **TriggerName** *(string) --*

                             The name of the trigger that started this
                             job run.

                           * **JobName** *(string) --*

                             The name of the job definition being used
                             in this run.

                           * **JobMode** *(string) --*

                             A mode that describes how a job was
                             created. Valid values are:

                             * "SCRIPT" - The job was created using
                               the Glue Studio script editor.

                             * "VISUAL" - The job was created using
                               the Glue Studio visual editor.

                             * "NOTEBOOK" - The job was created using
                               an interactive sessions notebook.

                             When the "JobMode" field is missing or
                             null, "SCRIPT" is assigned as the default
                             value.

                           * **JobRunQueuingEnabled** *(boolean) --*

                             Specifies whether job run queuing is
                             enabled for the job run.

                             A value of true means job run queuing is
                             enabled for the job run. If false or not
                             populated, the job run will not be
                             considered for queueing.

                           * **StartedOn** *(datetime) --*

                             The date and time at which this job run
                             was started.

                           * **LastModifiedOn** *(datetime) --*

                             The last time that this job run was
                             modified.

                           * **CompletedOn** *(datetime) --*

                             The date and time that this job run
                             completed.

                           * **JobRunState** *(string) --*

                             The current state of the job run. For
                             more information about the statuses of
                             jobs that have terminated abnormally, see
                             Glue Job Run Statuses.

                           * **Arguments** *(dict) --*

                             The job arguments associated with this
                             run. For this job run, they replace the
                             default arguments set in the job
                             definition itself.

                             You can specify arguments here that your
                             own job-execution script consumes, as
                             well as arguments that Glue itself
                             consumes.

                             Job arguments may be logged. Do not pass
                             plaintext secrets as arguments. Retrieve
                             secrets from a Glue Connection, Secrets
                             Manager or other secret management
                             mechanism if you intend to keep them
                             within the Job.

                             For information about how to specify and
                             consume your own Job arguments, see the
                             Calling Glue APIs in Python topic in the
                             developer guide.

                             For information about the arguments you
                             can provide to this field when
                             configuring Spark jobs, see the Special
                             Parameters Used by Glue topic in the
                             developer guide.

                             For information about the arguments you
                             can provide to this field when
                             configuring Ray jobs, see Using job
                             parameters in Ray jobs in the developer
                             guide.

                             * *(string) --*

                               * *(string) --*

                           * **ErrorMessage** *(string) --*

                             An error message associated with this job
                             run.

                           * **PredecessorRuns** *(list) --*

                             A list of predecessors to this job run.

                             * *(dict) --*

                               A job run that was used in the
                               predicate of a conditional trigger that
                               triggered this job run.

                               * **JobName** *(string) --*

                                 The name of the job definition used
                                 by the predecessor job run.

                               * **RunId** *(string) --*

                                 The job-run ID of the predecessor job
                                 run.

                           * **AllocatedCapacity** *(integer) --*

                             This field is deprecated. Use
                             "MaxCapacity" instead.

                             The number of Glue data processing units
                             (DPUs) allocated to this JobRun. From 2
                             to 100 DPUs can be allocated; the default
                             is 10. A DPU is a relative measure of
                             processing power that consists of 4 vCPUs
                             of compute capacity and 16 GB of memory.
                             For more information, see the Glue
                             pricing page.

                           * **ExecutionTime** *(integer) --*

                             The amount of time (in seconds) that the
                             job run consumed resources.

                           * **Timeout** *(integer) --*

                             The "JobRun" timeout in minutes. This is
                             the maximum time that a job run can
                             consume resources before it is terminated
                             and enters "TIMEOUT" status. This value
                             overrides the timeout value set in the
                             parent job.

                             Jobs must have timeout values less than 7
                             days or 10080 minutes. Otherwise, the
                             jobs will throw an exception.

                             When the value is left blank, the timeout
                             is defaulted to 2880 minutes.

                             Any existing Glue jobs that had a timeout
                             value greater than 7 days will be
                             defaulted to 7 days. For instance, if you
                             have specified a timeout of 20 days for a
                             batch job, it will be stopped on the 7th
                             day.

                             For streaming jobs, if you have set up a
                             maintenance window, it will be restarted
                             during the maintenance window after 7
                             days.

                           * **MaxCapacity** *(float) --*

                             For Glue version 1.0 or earlier jobs,
                             using the standard worker type, the
                             number of Glue data processing units
                             (DPUs) that can be allocated when this
                             job runs. A DPU is a relative measure of
                             processing power that consists of 4 vCPUs
                             of compute capacity and 16 GB of memory.
                             For more information, see the Glue
                             pricing page.

                             For Glue version 2.0+ jobs, you cannot
                             specify a "Maximum capacity". Instead,
                             you should specify a "Worker type" and
                             the "Number of workers".

                             Do not set "MaxCapacity" if using
                             "WorkerType" and "NumberOfWorkers".

                             The value that can be allocated for
                             "MaxCapacity" depends on whether you are
                             running a Python shell job, an Apache
                             Spark ETL job, or an Apache Spark
                             streaming ETL job:

                             * When you specify a Python shell job
                               ("JobCommand.Name"="pythonshell"), you
                               can allocate either 0.0625 or 1 DPU.
                               The default is 0.0625 DPU.

                             * When you specify an Apache Spark ETL
                               job ("JobCommand.Name"="glueetl") or
                               Apache Spark streaming ETL job
                               ("JobCommand.Name"="gluestreaming"),
                               you can allocate from 2 to 100 DPUs.
                               The default is 10 DPUs. This job type
                               cannot have a fractional DPU
                               allocation.

                           * **WorkerType** *(string) --*

                             The type of predefined worker that is
                             allocated when a job runs. Accepts a
                             value of G.1X, G.2X, G.4X, G.8X or G.025X
                             for Spark jobs. Accepts the value Z.2X
                             for Ray jobs.

                             * For the "G.1X" worker type, each worker
                               maps to 1 DPU (4 vCPUs, 16 GB of
                               memory) with 94GB disk, and provides 1
                               executor per worker. We recommend this
                               worker type for workloads such as data
                               transforms, joins, and queries; it
                               offers a scalable and cost-effective
                               way to run most jobs.

                             * For the "G.2X" worker type, each worker
                               maps to 2 DPU (8 vCPUs, 32 GB of
                               memory) with 138GB disk, and provides 1
                               executor per worker. We recommend this
                               worker type for workloads such as data
                               transforms, joins, and queries, as it
                               offers a scalable and cost-effective
                               way to run most jobs.

                             * For the "G.4X" worker type, each worker
                               maps to 4 DPU (16 vCPUs, 64 GB of
                               memory) with 256GB disk, and provides 1
                               executor per worker. We recommend this
                               worker type for jobs whose workloads
                               contain your most demanding transforms,
                               aggregations, joins, and queries. This
                               worker type is available only for Glue
                               version 3.0 or later Spark ETL jobs in
                               the following Amazon Web Services
                               Regions: US East (Ohio), US East (N.
                               Virginia), US West (Oregon), Asia
                               Pacific (Singapore), Asia Pacific
                               (Sydney), Asia Pacific (Tokyo), Canada
                               (Central), Europe (Frankfurt), Europe
                               (Ireland), and Europe (Stockholm).

                             * For the "G.8X" worker type, each worker
                               maps to 8 DPU (32 vCPUs, 128 GB of
                               memory) with 512GB disk, and provides 1
                               executor per worker. We recommend this
                               worker type for jobs whose workloads
                               contain your most demanding transforms,
                               aggregations, joins, and queries. This
                               worker type is available only for Glue
                               version 3.0 or later Spark ETL jobs, in
                               the same Amazon Web Services Regions as
                               supported for the "G.4X" worker type.

                             * For the "G.025X" worker type, each
                               worker maps to 0.25 DPU (2 vCPUs, 4 GB
                               of memory) with 84GB disk, and provides
                               1 executor per worker. We recommend
                               this worker type for low volume
                               streaming jobs. This worker type is
                               only available for Glue version 3.0 or
                               later streaming jobs.

                             * For the "Z.2X" worker type, each worker
                               maps to 2 M-DPU (8 vCPUs, 64 GB of
                               memory) with 128 GB disk, and provides
                               up to 8 Ray workers based on the
                               autoscaler.

                           * **NumberOfWorkers** *(integer) --*

                             The number of workers of a defined
                             "WorkerType" that are allocated when a
                             job runs.

                           * **SecurityConfiguration** *(string) --*

                             The name of the "SecurityConfiguration"
                             structure to be used with this job run.

                           * **LogGroupName** *(string) --*

                             The name of the log group for secure
                             logging that can be server-side encrypted
                             in Amazon CloudWatch using KMS. This name
                             can be "/aws-glue/jobs/", in which case
                             the default encryption is "NONE". If you
                             add a role name and
                             "SecurityConfiguration" name (in other
                             words, "/aws-glue/jobs-yourRoleName-
                             yourSecurityConfigurationName/"), then
                             that security configuration is used to
                             encrypt the log group.

                           * **NotificationProperty** *(dict) --*

                             Specifies configuration properties of a
                             job run notification.

                             * **NotifyDelayAfter** *(integer) --*

                               After a job run starts, the number of
                               minutes to wait before sending a job
                               run delay notification.

                           * **GlueVersion** *(string) --*

                             In Spark jobs, "GlueVersion" determines
                             the versions of Apache Spark and Python
                             that Glue makes available in a job. The
                             Python version indicates the version
                             supported for jobs of type Spark.

                             Ray jobs should set "GlueVersion" to
                             "4.0" or greater. However, the versions
                             of Ray, Python and additional libraries
                             available in your Ray job are determined
                             by the "Runtime" parameter of the Job
                             command.

                             For more information about the available
                             Glue versions and corresponding Spark and
                             Python versions, see Glue version in the
                             developer guide.

                             Jobs that are created without specifying
                             a Glue version default to Glue 0.9.

                           * **DPUSeconds** *(float) --*

                             This field can be set for either job runs
                             with execution class "FLEX" or when Auto
                             Scaling is enabled, and represents the
                             total time each executor ran during the
                             lifecycle of a job run in seconds,
                             multiplied by a DPU factor (1 for "G.1X",
                             2 for "G.2X", or 0.25 for "G.025X"
                             workers). This value may be different
                             than the "executionEngineRuntime" *
                             "MaxCapacity" as in the case of Auto
                             Scaling jobs, as the number of executors
                             running at a given time may be less than
                             the "MaxCapacity". Therefore, it is
                             possible that the value of "DPUSeconds"
                             is less than "executionEngineRuntime" *
                             "MaxCapacity".

                           * **ExecutionClass** *(string) --*

                             Indicates whether the job is run with a
                             standard or flexible execution class. The
                             standard execution class is ideal for
                             time-sensitive workloads that require
                             fast job startup and dedicated resources.

                             The flexible execution class is
                             appropriate for time-insensitive jobs
                             whose start and completion times may
                             vary.

                             Only jobs with Glue version 3.0 and above
                             and command type "glueetl" will be
                             allowed to set "ExecutionClass" to
                             "FLEX". The flexible execution class is
                             available for Spark jobs.

                           * **MaintenanceWindow** *(string) --*

                             This field specifies a day of the week
                             and hour for a maintenance window for
                             streaming jobs. Glue periodically
                             performs maintenance activities. During
                             these maintenance windows, Glue will need
                             to restart your streaming jobs.

                             Glue will restart the job within 3 hours
                             of the specified maintenance window. For
                             instance, if you set up the maintenance
                             window for Monday at 10:00AM GMT, your
                             jobs will be restarted between 10:00AM
                             GMT and 1:00PM GMT.

                           * **ProfileName** *(string) --*

                             The name of a Glue usage profile
                             associated with the job run.

                           * **StateDetail** *(string) --*

                             This field holds details that pertain to
                             the state of a job run. The field is
                             nullable.

                             For example, when a job run is in a
                             WAITING state as a result of job run
                             queuing, the field has the reason why the
                             job run is in that state.

                           * **ExecutionRoleSessionPolicy** *(string)
                             --*

                             This inline session policy, passed to the
                             StartJobRun API, allows you to
                             dynamically restrict the permissions of
                             the specified execution role for the
                             scope of the job, without requiring the
                             creation of additional IAM roles.

                     * **CrawlerDetails** *(dict) --*

                       Details of the crawler when the node represents
                       a crawler.

                       * **Crawls** *(list) --*

                         A list of crawls represented by the crawl
                         node.

                         * *(dict) --*

                           The details of a crawl in the workflow.

                           * **State** *(string) --*

                             The state of the crawler.

                           * **StartedOn** *(datetime) --*

                             The date and time on which the crawl
                             started.

                           * **CompletedOn** *(datetime) --*

                             The date and time on which the crawl
                             completed.

                           * **ErrorMessage** *(string) --*

                             The error message associated with the
                             crawl.

                           * **LogGroup** *(string) --*

                             The log group associated with the crawl.

                           * **LogStream** *(string) --*

                             The log stream associated with the crawl.

                 * **Edges** *(list) --*

                   A list of all the directed connections between the
                   nodes belonging to the workflow.

                   * *(dict) --*

                     An edge represents a directed connection between
                     two Glue components that are part of the workflow
                     the edge belongs to.

                     * **SourceId** *(string) --*

                       The unique ID of the node within the workflow
                       where the edge starts.

                     * **DestinationId** *(string) --*

                       The unique ID of the node within the workflow
                       where the edge ends.

               * **StartingEventBatchCondition** *(dict) --*

                 The batch condition that started the workflow run.

                 * **BatchSize** *(integer) --*

                   Number of events in the batch.

                 * **BatchWindow** *(integer) --*

                   Duration of the batch window in seconds.
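
The "DPUSeconds" description above can be made concrete with a small
worked example. This is only a sketch under stated assumptions: the
helper names "dpu_seconds" and "runtime_times_capacity" and the
per-executor runtimes are hypothetical, and summing per-executor
runtimes is one reading of the field description rather than a
documented formula.

```python
# DPU factors quoted in the "DPUSeconds" field description.
DPU_FACTOR = {"G.1X": 1.0, "G.2X": 2.0, "G.025X": 0.25}


def dpu_seconds(executor_runtimes_s, worker_type):
    """Sum each executor's runtime in seconds, scaled by the DPU factor.

    Hypothetical helper: it mirrors the wording of the field description
    and is not part of boto3.
    """
    return sum(executor_runtimes_s) * DPU_FACTOR[worker_type]


def runtime_times_capacity(execution_engine_runtime_s, max_capacity):
    """The "executionEngineRuntime" * "MaxCapacity" product from the text."""
    return execution_engine_runtime_s * max_capacity


# Made-up Auto Scaling run: 600 s long, MaxCapacity of 10 DPUs, G.1X workers.
# Three executors ran the whole time; two were added late and ran 200 s each.
runtimes = [600, 600, 600, 200, 200]
used = dpu_seconds(runtimes, "G.1X")
ceiling = runtime_times_capacity(600, 10)
print(used, ceiling)
```

Because fewer executors than "MaxCapacity" were running for most of
this made-up run, the computed value stays below the product of runtime
and capacity, which is the situation the field description refers to.
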


GetCrawlerMetrics
*****************

class Glue.Paginator.GetCrawlerMetrics

      paginator = client.get_paginator('get_crawler_metrics')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.get_crawler_metrics()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             CrawlerNameList=[
                 'string',
             ],
             PaginationConfig={
                 'MaxItems': 123,
                 'PageSize': 123,
                 'StartingToken': 'string'
             }
         )
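
The "PaginationConfig" keys in the request syntax above can be combined
to read results in bounded chunks. The following is a minimal sketch,
not part of boto3: "fetch_in_chunks" is a hypothetical helper, and it
assumes the object returned by "paginate()" offers
"build_full_result()", which in botocore returns the aggregated result
plus a "NextToken" entry when "MaxItems" cut the listing short.

```python
def fetch_in_chunks(paginator, result_key, chunk_size, **operation_kwargs):
    """Yield lists of at most chunk_size items until nothing is left.

    Hypothetical helper. `paginator` is expected to behave like the object
    returned by client.get_paginator(...).
    """
    token = None
    while True:
        config = {"MaxItems": chunk_size}
        if token is not None:
            # Resume where the previous chunk stopped.
            config["StartingToken"] = token
        iterator = paginator.paginate(PaginationConfig=config, **operation_kwargs)
        result = iterator.build_full_result()
        yield result.get(result_key, [])
        token = result.get("NextToken")
        if token is None:
            break
```

With a real client this might be called as
"fetch_in_chunks(client.get_paginator('get_crawler_metrics'),
'CrawlerMetricsList', 25)", processing each yielded list before the
next request is made.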

      Parameters:
         * **CrawlerNameList** (*list*) --

           A list of the names of crawlers about which to retrieve
           metrics.

           * *(string) --*

         * **PaginationConfig** (*dict*) --

           A dictionary that provides parameters to control
           pagination.

           * **MaxItems** *(integer) --*

             The total number of items to return. If the total number
             of items available is more than the value specified in
             "MaxItems", a "NextToken" will be provided in the output
             that you can use to resume pagination.

           * **PageSize** *(integer) --*

             The size of each page.

           * **StartingToken** *(string) --*

             A token to specify where to start paginating. This is the
             "NextToken" from a previous response.

      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'CrawlerMetricsList': [
                    {
                        'CrawlerName': 'string',
                        'TimeLeftSeconds': 123.0,
                        'StillEstimating': True|False,
                        'LastRuntimeSeconds': 123.0,
                        'MedianRuntimeSeconds': 123.0,
                        'TablesCreated': 123,
                        'TablesUpdated': 123,
                        'TablesDeleted': 123
                    },
                ],

            }

         **Response Structure**

         * *(dict) --*

           * **CrawlerMetricsList** *(list) --*

             A list of metrics for the specified crawler.

             * *(dict) --*

               Metrics for a specified crawler.

               * **CrawlerName** *(string) --*

                 The name of the crawler.

               * **TimeLeftSeconds** *(float) --*

                 The estimated time left to complete a running crawl.

               * **StillEstimating** *(boolean) --*

                 True if the crawler is still estimating how long it
                 will take to complete this run.

               * **LastRuntimeSeconds** *(float) --*

                 The duration of the crawler's most recent run, in
                 seconds.

               * **MedianRuntimeSeconds** *(float) --*

                 The median duration of this crawler's runs, in
                 seconds.

               * **TablesCreated** *(integer) --*

                 The number of tables created by this crawler.

               * **TablesUpdated** *(integer) --*

                 The number of tables updated by this crawler.

               * **TablesDeleted** *(integer) --*

                 The number of tables deleted by this crawler.
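
As an illustration of consuming the response structure above, the
sketch below walks pages shaped like the "GetCrawlerMetrics" output,
adds up the table counters, and collects the crawlers that report
"StillEstimating". The function name "summarize_crawler_metrics" and
the sample values are hypothetical; the pages are hand-written
dictionaries so the example runs without an AWS account.

```python
def summarize_crawler_metrics(pages):
    """Aggregate table counts and list crawlers that are still estimating."""
    totals = {"TablesCreated": 0, "TablesUpdated": 0, "TablesDeleted": 0}
    still_estimating = []
    for page in pages:
        for metrics in page.get("CrawlerMetricsList", []):
            for key in totals:
                totals[key] += metrics.get(key, 0)
            if metrics.get("StillEstimating"):
                still_estimating.append(metrics["CrawlerName"])
    return totals, still_estimating


# Hand-written pages shaped like the response syntax above (made-up values).
sample_pages = [
    {"CrawlerMetricsList": [
        {"CrawlerName": "orders", "StillEstimating": False,
         "TablesCreated": 2, "TablesUpdated": 1, "TablesDeleted": 0},
    ]},
    {"CrawlerMetricsList": [
        {"CrawlerName": "clicks", "StillEstimating": True,
         "TablesCreated": 0, "TablesUpdated": 3, "TablesDeleted": 1},
    ]},
]
totals, estimating = summarize_crawler_metrics(sample_pages)
print(totals, estimating)
```

With a real client, the "pages" argument would be the iterator returned
by "paginator.paginate(...)" for this paginator.
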


ListSchemas
***********

class Glue.Paginator.ListSchemas

      paginator = client.get_paginator('list_schemas')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.list_schemas()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             RegistryId={
                 'RegistryName': 'string',
                 'RegistryArn': 'string'
             },
             PaginationConfig={
                 'MaxItems': 123,
                 'PageSize': 123,
                 'StartingToken': 'string'
             }
         )

      Parameters:
         * **RegistryId** (*dict*) --

           A wrapper structure that may contain the registry name and
           Amazon Resource Name (ARN).

           * **RegistryName** *(string) --*

             Name of the registry. Used only for lookup. One of
             "RegistryArn" or "RegistryName" has to be provided.

           * **RegistryArn** *(string) --*

             ARN of the registry to be updated. One of "RegistryArn"
             or "RegistryName" has to be provided.

         * **PaginationConfig** (*dict*) --

           A dictionary that provides parameters to control
           pagination.

           * **MaxItems** *(integer) --*

             The total number of items to return. If the total number
             of items available is more than the value specified in
             "MaxItems", a "NextToken" will be provided in the output
             that you can use to resume pagination.

           * **PageSize** *(integer) --*

             The size of each page.

           * **StartingToken** *(string) --*

             A token to specify where to start paginating. This is the
             "NextToken" from a previous response.

      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'Schemas': [
                    {
                        'RegistryName': 'string',
                        'SchemaName': 'string',
                        'SchemaArn': 'string',
                        'Description': 'string',
                        'SchemaStatus': 'AVAILABLE'|'PENDING'|'DELETING',
                        'CreatedTime': 'string',
                        'UpdatedTime': 'string'
                    },
                ],

            }

         **Response Structure**

         * *(dict) --*

           * **Schemas** *(list) --*

             An array of "SchemaListItem" objects containing details
             of each schema.

             * *(dict) --*

               An object that contains minimal details for a schema.

               * **RegistryName** *(string) --*

                 The name of the registry where the schema resides.

               * **SchemaName** *(string) --*

                 The name of the schema.

               * **SchemaArn** *(string) --*

                 The Amazon Resource Name (ARN) for the schema.

               * **Description** *(string) --*

                 A description for the schema.

               * **SchemaStatus** *(string) --*

                 The status of the schema.

               * **CreatedTime** *(string) --*

                 The date and time that a schema was created.

               * **UpdatedTime** *(string) --*

                 The date and time that a schema was updated.
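
The "RegistryId" wrapper and the "SchemaStatus" field described above
can be handled with two small helpers. This is a sketch under
assumptions: "registry_id" and "schema_names_by_status" are
hypothetical names, the registry and schema names are made up, and the
check only enforces the stated rule that one of "RegistryName" or
"RegistryArn" has to be provided.

```python
def registry_id(name=None, arn=None):
    """Build the RegistryId wrapper, requiring a name or an ARN."""
    if not name and not arn:
        raise ValueError("provide RegistryName or RegistryArn")
    wrapper = {}
    if name:
        wrapper["RegistryName"] = name
    if arn:
        wrapper["RegistryArn"] = arn
    return wrapper


def schema_names_by_status(pages, status="AVAILABLE"):
    """Collect SchemaName values whose SchemaStatus equals `status`."""
    return [
        item["SchemaName"]
        for page in pages
        for item in page.get("Schemas", [])
        if item.get("SchemaStatus") == status
    ]


# Hand-written page shaped like the response syntax above (made-up values).
sample_pages = [
    {"Schemas": [
        {"RegistryName": "demo", "SchemaName": "orders", "SchemaStatus": "AVAILABLE"},
        {"RegistryName": "demo", "SchemaName": "drafts", "SchemaStatus": "PENDING"},
    ]},
]
available = schema_names_by_status(sample_pages)
print(registry_id(name="demo"), available)
```

With a real client, the wrapper would be passed as
"paginator.paginate(RegistryId=registry_id(name=...))" and the returned
iterator used as "pages".
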


GetResourcePolicies
*******************

class Glue.Paginator.GetResourcePolicies

      paginator = client.get_paginator('get_resource_policies')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.get_resource_policies()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             PaginationConfig={
                 'MaxItems': 123,
                 'PageSize': 123,
                 'StartingToken': 'string'
             }
         )

      Parameters:
         **PaginationConfig** (*dict*) --

         A dictionary that provides parameters to control pagination.

         * **MaxItems** *(integer) --*

           The total number of items to return. If the total number of
           items available is more than the value specified in
           "MaxItems", a "NextToken" will be provided in the output
           that you can use to resume pagination.

         * **PageSize** *(integer) --*

           The size of each page.

         * **StartingToken** *(string) --*

           A token to specify where to start paginating. This is the
           "NextToken" from a previous response.

      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'GetResourcePoliciesResponseList': [
                    {
                        'PolicyInJson': 'string',
                        'PolicyHash': 'string',
                        'CreateTime': datetime(2015, 1, 1),
                        'UpdateTime': datetime(2015, 1, 1)
                    },
                ],

            }

         **Response Structure**

         * *(dict) --*

           * **GetResourcePoliciesResponseList** *(list) --*

             A list of the individual resource policies and the
             account-level resource policy.

             * *(dict) --*

               A structure for returning a resource policy.

               * **PolicyInJson** *(string) --*

                 Contains the requested policy document, in JSON
                 format.

               * **PolicyHash** *(string) --*

                 Contains the hash value associated with this policy.

               * **CreateTime** *(datetime) --*

                 The date and time at which the policy was created.

               * **UpdateTime** *(datetime) --*

                 The date and time at which the policy was last
                 updated.
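
Since "PolicyInJson" is returned as a JSON string, a caller will
usually decode it before inspecting statements. The sketch below shows
one way to do that over pages shaped like the response above. The
helper names "decode_policies" and "most_recently_updated" and the
sample policy documents are hypothetical.

```python
import json
from datetime import datetime


def decode_policies(pages):
    """Parse each PolicyInJson string and keep its hash and update time."""
    decoded = []
    for page in pages:
        for item in page.get("GetResourcePoliciesResponseList", []):
            decoded.append({
                "hash": item.get("PolicyHash"),
                "updated": item.get("UpdateTime"),
                "document": json.loads(item["PolicyInJson"]),
            })
    return decoded


def most_recently_updated(decoded):
    """Return the entry with the latest UpdateTime, or None if none is dated."""
    dated = [entry for entry in decoded if entry["updated"] is not None]
    return max(dated, key=lambda entry: entry["updated"], default=None)


# Hand-written page shaped like the response syntax above (made-up values).
sample_pages = [
    {"GetResourcePoliciesResponseList": [
        {"PolicyInJson": '{"Version": "2012-10-17", "Statement": []}',
         "PolicyHash": "abc", "UpdateTime": datetime(2015, 1, 1)},
        {"PolicyInJson": '{"Version": "2012-10-17", "Statement": [{"Effect": "Allow"}]}',
         "PolicyHash": "def", "UpdateTime": datetime(2016, 6, 1)},
    ]},
]
policies = decode_policies(sample_pages)
latest = most_recently_updated(policies)
print(latest["hash"], len(latest["document"]["Statement"]))
```

With a real client, "pages" would be the iterator returned by
"paginator.paginate()" for this paginator.
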


GetTables
*********

class Glue.Paginator.GetTables

      paginator = client.get_paginator('get_tables')

   paginate(**kwargs)

      Creates an iterator that will paginate through responses from
      "Glue.Client.get_tables()".

      See also: AWS API Documentation

      **Request Syntax**

         response_iterator = paginator.paginate(
             CatalogId='string',
             DatabaseName='string',
             Expression='string',
             TransactionId='string',
             QueryAsOfTime=datetime(2015, 1, 1),
             IncludeStatusDetails=True|False,
             AttributesToGet=[
                 'NAME'|'TABLE_TYPE',
             ],
             PaginationConfig={
                 'MaxItems': 123,
                 'PageSize': 123,
                 'StartingToken': 'string'
             }
         )
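
The parameter descriptions for this paginator state a few constraints:
"QueryAsOfTime" cannot be specified along with "TransactionId", and
"AttributesToGet" must be either "NAME" alone or "NAME" with
"TABLE_TYPE". A minimal sketch of checking these before calling
"paginate()" follows. The helper "build_get_tables_kwargs" and the
database name are hypothetical, and the checks are client-side
conveniences rather than anything boto3 performs itself.

```python
from datetime import datetime

# The two combinations listed in the AttributesToGet description.
VALID_ATTRIBUTE_SETS = ({"NAME"}, {"NAME", "TABLE_TYPE"})


def build_get_tables_kwargs(database_name, expression=None,
                            transaction_id=None, query_as_of_time=None,
                            attributes_to_get=None):
    """Assemble paginate() keyword arguments, rejecting invalid mixes."""
    if transaction_id is not None and query_as_of_time is not None:
        raise ValueError("QueryAsOfTime cannot be combined with TransactionId")
    kwargs = {"DatabaseName": database_name}
    if expression is not None:
        kwargs["Expression"] = expression
    if transaction_id is not None:
        kwargs["TransactionId"] = transaction_id
    if query_as_of_time is not None:
        kwargs["QueryAsOfTime"] = query_as_of_time
    if attributes_to_get is not None:
        # An empty list fails this check too, matching the description.
        if set(attributes_to_get) not in VALID_ATTRIBUTE_SETS:
            raise ValueError("AttributesToGet must be NAME or NAME plus TABLE_TYPE")
        kwargs["AttributesToGet"] = list(attributes_to_get)
    return kwargs


# Made-up database name and pattern.
kwargs = build_get_tables_kwargs(
    "sales", expression="orders_.*", attributes_to_get=["NAME", "TABLE_TYPE"]
)
print(kwargs)
```

The resulting dictionary could then be passed as
"paginator.paginate(**kwargs)", optionally together with a
"PaginationConfig" entry.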

      Parameters:
         * **CatalogId** (*string*) -- The ID of the Data Catalog
           where the tables reside. If none is provided, the Amazon
           Web Services account ID is used by default.

         * **DatabaseName** (*string*) --

           **[REQUIRED]**

           The database in the catalog whose tables to list. For Hive
           compatibility, this name is entirely lowercase.

         * **Expression** (*string*) -- A regular expression pattern.
           If present, only those tables whose names match the pattern
           are returned.

         * **TransactionId** (*string*) -- The transaction ID at which
           to read the table contents.

         * **QueryAsOfTime** (*datetime*) -- The time as of when to
           read the table contents. If not set, the most recent
           transaction commit time will be used. Cannot be specified
           along with "TransactionId".

         * **IncludeStatusDetails** (*boolean*) -- Specifies whether
           to include status details related to a request to create or
           update a Glue Data Catalog view.

         * **AttributesToGet** (*list*) --

           Specifies the table fields returned by the "GetTables"
           call. This parameter doesn’t accept an empty list. The
           request must include "NAME".

           The following are the valid combinations of values:

           * "NAME" - Names of all tables in the database.

           * "NAME", "TABLE_TYPE" - Names of all tables and the table
             types.

           * *(string) --*

         * **PaginationConfig** (*dict*) --

           A dictionary that provides parameters to control
           pagination.

           * **MaxItems** *(integer) --*

             The total number of items to return. If the total number
             of items available is more than the value specified in
             "MaxItems", a "NextToken" will be provided in the output
             that you can use to resume pagination.

           * **PageSize** *(integer) --*

             The size of each page.

           * **StartingToken** *(string) --*

             A token to specify where to start paginating. This is the
             "NextToken" from a previous response.

      Return type:
         dict

      Returns:
         **Response Syntax**

            {
                'TableList': [
                    {
                        'Name': 'string',
                        'DatabaseName': 'string',
                        'Description': 'string',
                        'Owner': 'string',
                        'CreateTime': datetime(2015, 1, 1),
                        'UpdateTime': datetime(2015, 1, 1),
                        'LastAccessTime': datetime(2015, 1, 1),
                        'LastAnalyzedTime': datetime(2015, 1, 1),
                        'Retention': 123,
                        'StorageDescriptor': {
                            'Columns': [
                                {
                                    'Name': 'string',
                                    'Type': 'string',
                                    'Comment': 'string',
                                    'Parameters': {
                                        'string': 'string'
                                    }
                                },
                            ],
                            'Location': 'string',
                            'AdditionalLocations': [
                                'string',
                            ],
                            'InputFormat': 'string',
                            'OutputFormat': 'string',
                            'Compressed': True|False,
                            'NumberOfBuckets': 123,
                            'SerdeInfo': {
                                'Name': 'string',
                                'SerializationLibrary': 'string',
                                'Parameters': {
                                    'string': 'string'
                                }
                            },
                            'BucketColumns': [
                                'string',
                            ],
                            'SortColumns': [
                                {
                                    'Column': 'string',
                                    'SortOrder': 123
                                },
                            ],
                            'Parameters': {
                                'string': 'string'
                            },
                            'SkewedInfo': {
                                'SkewedColumnNames': [
                                    'string',
                                ],
                                'SkewedColumnValues': [
                                    'string',
                                ],
                                'SkewedColumnValueLocationMaps': {
                                    'string': 'string'
                                }
                            },
                            'StoredAsSubDirectories': True|False,
                            'SchemaReference': {
                                'SchemaId': {
                                    'SchemaArn': 'string',
                                    'SchemaName': 'string',
                                    'RegistryName': 'string'
                                },
                                'SchemaVersionId': 'string',
                                'SchemaVersionNumber': 123
                            }
                        },
                        'PartitionKeys': [
                            {
                                'Name': 'string',
                                'Type': 'string',
                                'Comment': 'string',
                                'Parameters': {
                                    'string': 'string'
                                }
                            },
                        ],
                        'ViewOriginalText': 'string',
                        'ViewExpandedText': 'string',
                        'TableType': 'string',
                        'Parameters': {
                            'string': 'string'
                        },
                        'CreatedBy': 'string',
                        'IsRegisteredWithLakeFormation': True|False,
                        'TargetTable': {
                            'CatalogId': 'string',
                            'DatabaseName': 'string',
                            'Name': 'string',
                            'Region': 'string'
                        },
                        'CatalogId': 'string',
                        'VersionId': 'string',
                        'FederatedTable': {
                            'Identifier': 'string',
                            'DatabaseIdentifier': 'string',
                            'ConnectionName': 'string',
                            'ConnectionType': 'string'
                        },
                        'ViewDefinition': {
                            'IsProtected': True|False,
                            'Definer': 'string',
                            'SubObjects': [
                                'string',
                            ],
                            'Representations': [
                                {
                                    'Dialect': 'REDSHIFT'|'ATHENA'|'SPARK',
                                    'DialectVersion': 'string',
                                    'ViewOriginalText': 'string',
                                    'ViewExpandedText': 'string',
                                    'ValidationConnection': 'string',
                                    'IsStale': True|False
                                },
                            ]
                        },
                        'IsMultiDialectView': True|False,
                        'Status': {
                            'RequestedBy': 'string',
                            'UpdatedBy': 'string',
                            'RequestTime': datetime(2015, 1, 1),
                            'UpdateTime': datetime(2015, 1, 1),
                            'Action': 'UPDATE'|'CREATE',
                            'State': 'QUEUED'|'IN_PROGRESS'|'SUCCESS'|'STOPPED'|'FAILED',
                            'Error': {
                                'ErrorCode': 'string',
                                'ErrorMessage': 'string'
                            },
                            'Details': {
                                'RequestedChange': {'... recursive ...'},
                                'ViewValidations': [
                                    {
                                        'Dialect': 'REDSHIFT'|'ATHENA'|'SPARK',
                                        'DialectVersion': 'string',
                                        'ViewValidationText': 'string',
                                        'UpdateTime': datetime(2015, 1, 1),
                                        'State': 'QUEUED'|'IN_PROGRESS'|'SUCCESS'|'STOPPED'|'FAILED',
                                        'Error': {
                                            'ErrorCode': 'string',
                                            'ErrorMessage': 'string'
                                        }
                                    },
                                ]
                            }
                        }
                    },
                ],

            }

         **Response Structure**

         * *(dict) --*

           * **TableList** *(list) --*

             A list of the requested "Table" objects.

             * *(dict) --*

               Represents a collection of related data organized in
               columns and rows.

               * **Name** *(string) --*

                 The table name. For Hive compatibility, this must be
                 entirely lowercase.

               * **DatabaseName** *(string) --*

                 The name of the database where the table metadata
                 resides. For Hive compatibility, this must be all
                 lowercase.

               * **Description** *(string) --*

                 A description of the table.

               * **Owner** *(string) --*

                 The owner of the table.

               * **CreateTime** *(datetime) --*

                 The time when the table definition was created in the
                 Data Catalog.

               * **UpdateTime** *(datetime) --*

                 The last time that the table was updated.

               * **LastAccessTime** *(datetime) --*

                 The last time that the table was accessed. This is
                 usually taken from HDFS, and might not be reliable.

               * **LastAnalyzedTime** *(datetime) --*

                 The last time that column statistics were computed
                 for this table.

               * **Retention** *(integer) --*

                 The retention time for this table.

               * **StorageDescriptor** *(dict) --*

                 A storage descriptor containing information about the
                 physical storage of this table.

                 * **Columns** *(list) --*

                   A list of the "Columns" in the table.

                   * *(dict) --*

                     A column in a "Table".

                     * **Name** *(string) --*

                       The name of the "Column".

                     * **Type** *(string) --*

                       The data type of the "Column".

                     * **Comment** *(string) --*

                       A free-form text comment.

                     * **Parameters** *(dict) --*

                       These key-value pairs define properties
                       associated with the column.

                       * *(string) --*

                         * *(string) --*

                 * **Location** *(string) --*

                   The physical location of the table. By default,
                   this takes the form of the warehouse location,
                   followed by the database location in the warehouse,
                   followed by the table name.

                 * **AdditionalLocations** *(list) --*

                   A list of locations that point to the path where a
                   Delta table is located.

                   * *(string) --*

                 * **InputFormat** *(string) --*

                   The input format: "SequenceFileInputFormat"
                   (binary), or "TextInputFormat", or a custom format.

                 * **OutputFormat** *(string) --*

                   The output format: "SequenceFileOutputFormat"
                   (binary), or "IgnoreKeyTextOutputFormat", or a
                   custom format.

                 * **Compressed** *(boolean) --*

                   "True" if the data in the table is compressed, or
                   "False" if not.

                 * **NumberOfBuckets** *(integer) --*

                   Must be specified if the table contains any
                   dimension columns.

                 * **SerdeInfo** *(dict) --*

                   The serialization/deserialization (SerDe)
                   information.

                   * **Name** *(string) --*

                     Name of the SerDe.

                   * **SerializationLibrary** *(string) --*

                     Usually the class that implements the SerDe. An
                     example is
                     "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe".

                   * **Parameters** *(dict) --*

                     These key-value pairs define initialization
                     parameters for the SerDe.

                     * *(string) --*

                       * *(string) --*

                 * **BucketColumns** *(list) --*

                   A list of reducer grouping columns, clustering
                   columns, and bucketing columns in the table.

                   * *(string) --*

                 * **SortColumns** *(list) --*

                   A list specifying the sort order of each bucket in
                   the table.

                   * *(dict) --*

                     Specifies the sort order of a sorted column.

                     * **Column** *(string) --*

                       The name of the column.

                     * **SortOrder** *(integer) --*

                       Indicates that the column is sorted in
                       ascending order ("== 1"), or in descending
                       order ("== 0").

                 * **Parameters** *(dict) --*

                   The user-supplied properties in key-value form.

                   * *(string) --*

                     * *(string) --*

                 * **SkewedInfo** *(dict) --*

                   The information about values that appear frequently
                   in a column (skewed values).

                   * **SkewedColumnNames** *(list) --*

                     A list of names of columns that contain skewed
                     values.

                     * *(string) --*

                   * **SkewedColumnValues** *(list) --*

                     A list of values that appear so frequently as to
                     be considered skewed.

                     * *(string) --*

                   * **SkewedColumnValueLocationMaps** *(dict) --*

                     A mapping of skewed values to the columns that
                     contain them.

                     * *(string) --*

                       * *(string) --*

                 * **StoredAsSubDirectories** *(boolean) --*

                   "True" if the table data is stored in
                   subdirectories, or "False" if not.

                 * **SchemaReference** *(dict) --*

                   An object that references a schema stored in the
                   Glue Schema Registry.

                   When creating a table, you can pass an empty list
                   of columns for the schema, and instead use a schema
                   reference.

                   * **SchemaId** *(dict) --*

                     A structure that contains schema identity fields.
                     Either this or the "SchemaVersionId" has to be
                     provided.

                     * **SchemaArn** *(string) --*

                       The Amazon Resource Name (ARN) of the schema.
                       One of "SchemaArn" or "SchemaName" has to be
                       provided.

                     * **SchemaName** *(string) --*

                       The name of the schema. One of "SchemaArn" or
                       "SchemaName" has to be provided.

                     * **RegistryName** *(string) --*

                       The name of the schema registry that contains
                       the schema.

                   * **SchemaVersionId** *(string) --*

                     The unique ID assigned to a version of the
                     schema. Either this or the "SchemaId" has to be
                     provided.

                   * **SchemaVersionNumber** *(integer) --*

                     The version number of the schema.

               * **PartitionKeys** *(list) --*

                 A list of columns by which the table is partitioned.
                 Only primitive types are supported as partition keys.

                 When you create a table used by Amazon Athena, and
                 you do not specify any "partitionKeys", you must at
                 least set the value of "partitionKeys" to an empty
                 list. For example:

                 ""PartitionKeys": []"

                 * *(dict) --*

                   A column in a "Table".

                   * **Name** *(string) --*

                     The name of the "Column".

                   * **Type** *(string) --*

                     The data type of the "Column".

                   * **Comment** *(string) --*

                     A free-form text comment.

                   * **Parameters** *(dict) --*

                     These key-value pairs define properties
                     associated with the column.

                     * *(string) --*

                       * *(string) --*

               * **ViewOriginalText** *(string) --*

                 Included for Apache Hive compatibility. Not used in
                 the normal course of Glue operations. If the table is
                 a "VIRTUAL_VIEW", this field holds certain Athena
                 configuration encoded in base64.

               * **ViewExpandedText** *(string) --*

                 Included for Apache Hive compatibility. Not used in
                 the normal course of Glue operations.

               * **TableType** *(string) --*

                 The type of this table. Glue will create tables with
                 the "EXTERNAL_TABLE" type. Other services, such as
                 Athena, may create tables with additional table
                 types.

                 Glue related table types:

                 * "EXTERNAL_TABLE": Hive compatible attribute;
                   indicates a non-Hive managed table.

                 * "GOVERNED": Used by Lake Formation. The Glue Data
                   Catalog understands "GOVERNED".

               * **Parameters** *(dict) --*

                 These key-value pairs define properties associated
                 with the table.

                 * *(string) --*

                   * *(string) --*

               * **CreatedBy** *(string) --*

                 The person or entity who created the table.

               * **IsRegisteredWithLakeFormation** *(boolean) --*

                 Indicates whether the table has been registered with
                 Lake Formation.

               * **TargetTable** *(dict) --*

                 A "TableIdentifier" structure that describes a target
                 table for resource linking.

                 * **CatalogId** *(string) --*

                   The ID of the Data Catalog in which the table
                   resides.

                 * **DatabaseName** *(string) --*

                   The name of the catalog database that contains the
                   target table.

                 * **Name** *(string) --*

                   The name of the target table.

                 * **Region** *(string) --*

                   Region of the target table.

               * **CatalogId** *(string) --*

                 The ID of the Data Catalog in which the table
                 resides.

               * **VersionId** *(string) --*

                 The ID of the table version.

               * **FederatedTable** *(dict) --*

                 A "FederatedTable" structure that references an
                 entity outside the Glue Data Catalog.

                 * **Identifier** *(string) --*

                   A unique identifier for the federated table.

                 * **DatabaseIdentifier** *(string) --*

                   A unique identifier for the federated database.

                 * **ConnectionName** *(string) --*

                   The name of the connection to the external
                   metastore.

                 * **ConnectionType** *(string) --*

                   The type of connection used to access the federated
                   table, specifying the protocol or method for
                   connecting to the external data source.

               * **ViewDefinition** *(dict) --*

                 A structure that contains all the information that
                 defines the view, including the dialect or dialects
                 for the view, and the query.

                 * **IsProtected** *(boolean) --*

                   You can set this flag as true to instruct the
                   engine not to push user-provided operations into
                   the logical plan of the view during query planning.
                   However, setting this flag does not guarantee that
                   the engine will comply. Refer to the engine's
                   documentation to understand the guarantees
                   provided, if any.

                 * **Definer** *(string) --*

                   The definer of a view in SQL.

                 * **SubObjects** *(list) --*

                   A list of table Amazon Resource Names (ARNs).

                   * *(string) --*

                 * **Representations** *(list) --*

                   A list of representations.

                   * *(dict) --*

                     A structure that contains the dialect of the
                     view, and the query that defines the view.

                     * **Dialect** *(string) --*

                       The dialect of the query engine.

                     * **DialectVersion** *(string) --*

                       The version of the dialect of the query engine.
                       For example, 3.0.0.

                     * **ViewOriginalText** *(string) --*

                       The "SELECT" query provided by the customer
                       during "CREATE VIEW DDL". This SQL is not used
                       during a query on a view ( "ViewExpandedText"
                       is used instead). "ViewOriginalText" is used
                       for cases like "SHOW CREATE VIEW" where users
                       want to see the original DDL command that
                       created the view.

                     * **ViewExpandedText** *(string) --*

                       The expanded SQL for the view. This SQL is used
                       by engines while processing a query on a view.
                       Engines may perform operations during view
                       creation to transform "ViewOriginalText" to
                       "ViewExpandedText". For example:

                       * Fully qualified identifiers: "SELECT * from
                         table1 -> SELECT * from db1.table1"

                     * **ValidationConnection** *(string) --*

                       The name of the connection to be used to
                       validate the specific representation of the
                       view.

                     * **IsStale** *(boolean) --*

                       Dialects marked as stale are no longer valid
                       and must be updated before they can be queried
                       in their respective query engines.

               * **IsMultiDialectView** *(boolean) --*

                 Specifies whether the view supports the SQL dialects
                 of one or more different query engines and can
                 therefore be read by those engines.

               * **Status** *(dict) --*

                 A structure containing information about the state of
                 an asynchronous change to a table.

                 * **RequestedBy** *(string) --*

                   The ARN of the user who requested the asynchronous
                   change.

                 * **UpdatedBy** *(string) --*

                   The ARN of the user who last manually altered the
                   asynchronous change (for example, by requesting
                   cancellation).

                 * **RequestTime** *(datetime) --*

                   An ISO 8601 formatted date string indicating the
                   time that the change was initiated.

                 * **UpdateTime** *(datetime) --*

                   An ISO 8601 formatted date string indicating the
                   time that the state was last updated.

                 * **Action** *(string) --*

                   Indicates which action was called on the table,
                   currently only "CREATE" or "UPDATE".

                 * **State** *(string) --*

                   A generic status for the change in progress, such
                   as QUEUED, IN_PROGRESS, SUCCESS, or FAILED.

                 * **Error** *(dict) --*

                   An error that will only appear when the state is
                   "FAILED". This is a parent-level exception message;
                   there may be different "Error" values for each
                   dialect.

                   * **ErrorCode** *(string) --*

                     The code associated with this error.

                   * **ErrorMessage** *(string) --*

                     A message describing the error.

                 * **Details** *(dict) --*

                   A "StatusDetails" object with information about the
                   requested change.

                   * **RequestedChange** *(dict) --*

                     A "Table" object representing the requested
                     changes.

                   * **ViewValidations** *(list) --*

                     A list of "ViewValidation" objects that contain
                     information for an analytical engine to validate
                     a view.

                     * *(dict) --*

                       A structure that contains information for an
                       analytical engine to validate a view, prior to
                       persisting the view metadata. Used in the case
                       of direct "UpdateTable" or "CreateTable" API
                       calls.

                       * **Dialect** *(string) --*

                         The dialect of the query engine.

                       * **DialectVersion** *(string) --*

                         The version of the dialect of the query
                         engine. For example, 3.0.0.

                       * **ViewValidationText** *(string) --*

                         The "SELECT" query that defines the view, as
                         provided by the customer.

                       * **UpdateTime** *(datetime) --*

                         The time of the last update.

                       * **State** *(string) --*

                         The state of the validation.

                       * **Error** *(dict) --*

                         An error associated with the validation.

                         * **ErrorCode** *(string) --*

                           The code associated with this error.

                         * **ErrorMessage** *(string) --*

                           A message describing the error.
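
The response structure above is deeply nested, so a small helper that flattens each "Table" entry can make it easier to scan. The following is a minimal sketch, not part of boto3: the helper names "summarize_table" and "summarize_tables" are hypothetical, and the sample data is hand-written in the documented shape rather than real service output.

```python
from typing import Any, Dict, List


def summarize_table(table: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce one "Table" dict to a few commonly inspected fields."""
    storage = table.get("StorageDescriptor", {})
    view = table.get("ViewDefinition", {})
    return {
        "name": f"{table.get('DatabaseName', '?')}.{table['Name']}",
        "location": storage.get("Location"),
        # Columns and partition keys share the same shape (Name, Type, ...).
        "columns": [(c["Name"], c.get("Type")) for c in storage.get("Columns", [])],
        "partition_keys": [k["Name"] for k in table.get("PartitionKeys", [])],
        # Representations flagged IsStale must be updated before they can
        # be queried in their respective engines.
        "stale_dialects": [
            r.get("Dialect")
            for r in view.get("Representations", [])
            if r.get("IsStale")
        ],
    }


def summarize_tables(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Summarize every entry in the response's "TableList"."""
    return [summarize_table(t) for t in response.get("TableList", [])]


# Hand-written sample in the documented shape (not real service output).
sample_response = {
    "TableList": [
        {
            "Name": "orders",
            "DatabaseName": "sales",
            "StorageDescriptor": {
                "Columns": [
                    {"Name": "order_id", "Type": "bigint"},
                    {"Name": "amount", "Type": "double"},
                ],
                "Location": "s3://example-bucket/sales/orders/",
            },
            "PartitionKeys": [{"Name": "order_date", "Type": "string"}],
        },
        {
            "Name": "orders_view",
            "DatabaseName": "sales",
            "PartitionKeys": [],
            "ViewDefinition": {
                "Representations": [
                    {"Dialect": "ATHENA", "IsStale": False},
                    {"Dialect": "SPARK", "IsStale": True},
                ]
            },
        },
    ]
}

summaries = summarize_tables(sample_response)
for summary in summaries:
    print(summary)
```

Most fields are read with "dict.get" because optional members may be absent from a given entry; only "Name" is indexed directly.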


list_connection_types
*********************

Glue.Client.list_connection_types(**kwargs)

   The "ListConnectionTypes" API provides a discovery mechanism for
   learning which connection types are available in Glue. The
   response contains a list of connection types with high-level
   details of what is supported for each connection type. The
   connection types listed are the set of supported options for the
   "ConnectionType" value in the "CreateConnection" API.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.list_connection_types(
          MaxResults=123,
          NextToken='string'
      )

   Parameters:
      * **MaxResults** (*integer*) -- The maximum number of results to
        return.

      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation call.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'ConnectionTypes': [
                 {
                     'ConnectionType': 'JDBC'|'SFTP'|'MONGODB'|'KAFKA'|'NETWORK'|'MARKETPLACE'|'CUSTOM'|'SALESFORCE'|'VIEW_VALIDATION_REDSHIFT'|'VIEW_VALIDATION_ATHENA'|'GOOGLEADS'|'GOOGLESHEETS'|'GOOGLEANALYTICS4'|'SERVICENOW'|'MARKETO'|'SAPODATA'|'ZENDESK'|'JIRACLOUD'|'NETSUITEERP'|'HUBSPOT'|'FACEBOOKADS'|'INSTAGRAMADS'|'ZOHOCRM'|'SALESFORCEPARDOT'|'SALESFORCEMARKETINGCLOUD'|'SLACK'|'STRIPE'|'INTERCOM'|'SNAPCHATADS',
                     'DisplayName': 'string',
                     'Vendor': 'string',
                     'Description': 'string',
                     'Categories': [
                         'string',
                     ],
                     'Capabilities': {
                         'SupportedAuthenticationTypes': [
                             'BASIC'|'OAUTH2'|'CUSTOM'|'IAM',
                         ],
                         'SupportedDataOperations': [
                             'READ'|'WRITE',
                         ],
                         'SupportedComputeEnvironments': [
                             'SPARK'|'ATHENA'|'PYTHON',
                         ]
                     },
                     'LogoUrl': 'string',
                     'ConnectionTypeVariants': [
                         {
                             'ConnectionTypeVariantName': 'string',
                             'DisplayName': 'string',
                             'Description': 'string',
                             'LogoUrl': 'string'
                         },
                     ]
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **ConnectionTypes** *(list) --*

          A list of "ConnectionTypeBrief" objects containing brief
          information about the supported connection types.

          * *(dict) --*

            Brief information about a supported connection type
            returned by the "ListConnectionTypes" API.

            * **ConnectionType** *(string) --*

              The name of the connection type.

            * **DisplayName** *(string) --*

              The human-readable name for the connection type that is
              displayed in the Glue console.

            * **Vendor** *(string) --*

              The name of the vendor or provider that created or
              maintains this connection type.

            * **Description** *(string) --*

              A description of the connection type.

            * **Categories** *(list) --*

              A list of categories that this connection type belongs
              to. Categories help users filter and find appropriate
              connection types based on their use cases.

              * *(string) --*

            * **Capabilities** *(dict) --*

              The supported authentication types, data interface types
              (compute environments), and data operations of the
              connector.

              * **SupportedAuthenticationTypes** *(list) --*

                A list of supported authentication types.

                * *(string) --*

              * **SupportedDataOperations** *(list) --*

                A list of supported data operations.

                * *(string) --*

              * **SupportedComputeEnvironments** *(list) --*

                A list of supported compute environments.

                * *(string) --*

            * **LogoUrl** *(string) --*

              The URL of the logo associated with a connection type.

            * **ConnectionTypeVariants** *(list) --*

              A list of variants available for this connection type.
              Different variants may provide specialized
              configurations for specific use cases or implementations
              of the same general connection type.

              * *(dict) --*

                Represents a variant of a connection type in Glue Data
                Catalog. Connection type variants provide specific
                configurations and behaviors for different
                implementations of the same general connection type.

                * **ConnectionTypeVariantName** *(string) --*

                  The unique identifier for the connection type
                  variant. This name is used internally to identify
                  the specific variant of a connection type.

                * **DisplayName** *(string) --*

                  The human-readable name for the connection type
                  variant that is displayed in the Glue console.

                * **Description** *(string) --*

                  A detailed description of the connection type
                  variant, including its purpose, use cases, and any
                  specific configuration requirements.

                * **LogoUrl** *(string) --*

                  The URL of the logo associated with a connection
                  type variant.

        * **NextToken** *(string) --*

          A continuation token, if the current list segment is not the
          last.

   **Exceptions**

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.AccessDeniedException"


delete_job
**********

Glue.Client.delete_job(**kwargs)

   Deletes a specified job definition. If the job definition is not
   found, no exception is thrown.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_job(
          JobName='string'
      )

   Parameters:
      **JobName** (*string*) --

      **[REQUIRED]**

      The name of the job definition to delete.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'JobName': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **JobName** *(string) --*

          The name of the job definition that was deleted.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"


update_trigger
**************

Glue.Client.update_trigger(**kwargs)

   Updates a trigger definition.

   Job arguments may be logged. Do not pass plaintext secrets as
   arguments. Retrieve secrets from a Glue Connection, Amazon Web
   Services Secrets Manager or other secret management mechanism if
   you intend to keep them within the Job.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.update_trigger(
          Name='string',
          TriggerUpdate={
              'Name': 'string',
              'Description': 'string',
              'Schedule': 'string',
              'Actions': [
                  {
                      'JobName': 'string',
                      'Arguments': {
                          'string': 'string'
                      },
                      'Timeout': 123,
                      'SecurityConfiguration': 'string',
                      'NotificationProperty': {
                          'NotifyDelayAfter': 123
                      },
                      'CrawlerName': 'string'
                  },
              ],
              'Predicate': {
                  'Logical': 'AND'|'ANY',
                  'Conditions': [
                      {
                          'LogicalOperator': 'EQUALS',
                          'JobName': 'string',
                          'State': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT'|'ERROR'|'WAITING'|'EXPIRED',
                          'CrawlerName': 'string',
                          'CrawlState': 'RUNNING'|'CANCELLING'|'CANCELLED'|'SUCCEEDED'|'FAILED'|'ERROR'
                      },
                  ]
              },
              'EventBatchingCondition': {
                  'BatchSize': 123,
                  'BatchWindow': 123
              }
          }
      )

   Parameters:
      * **Name** (*string*) --

        **[REQUIRED]**

        The name of the trigger to update.

      * **TriggerUpdate** (*dict*) --

        **[REQUIRED]**

        The new values with which to update the trigger.

        * **Name** *(string) --*

          Reserved for future use.

        * **Description** *(string) --*

          A description of this trigger.

        * **Schedule** *(string) --*

          A "cron" expression used to specify the schedule (see
          Time-Based Schedules for Jobs and Crawlers). For example,
          to run something every day at 12:15 UTC, you would specify:
          "cron(15 12 * * ? *)".

        * **Actions** *(list) --*

          The actions initiated by this trigger.

          * *(dict) --*

            Defines an action to be initiated by a trigger.

            * **JobName** *(string) --*

              The name of a job to be run.

            * **Arguments** *(dict) --*

              The job arguments used when this trigger fires. For this
              job run, they replace the default arguments set in the
              job definition itself.

              You can specify arguments here that your own job-
              execution script consumes, as well as arguments that
              Glue itself consumes.

              For information about how to specify and consume your
              own Job arguments, see the Calling Glue APIs in Python
              topic in the developer guide.

              For information about the key-value pairs that Glue
              consumes to set up your job, see the Special Parameters
              Used by Glue topic in the developer guide.

              * *(string) --*

                * *(string) --*

            * **Timeout** *(integer) --*

              The "JobRun" timeout in minutes. This is the maximum
              time that a job run can consume resources before it is
              terminated and enters "TIMEOUT" status. This overrides
              the timeout value set in the parent job.

              Jobs must have timeout values less than 7 days or 10080
              minutes. Otherwise, the jobs will throw an exception.

              When the value is left blank, the timeout is defaulted
              to 2880 minutes.

              Any existing Glue jobs that had a timeout value greater
              than 7 days will be defaulted to 7 days. For instance,
              if you have specified a timeout of 20 days for a batch
              job, it will be stopped on the 7th day.

              For streaming jobs, if you have set up a maintenance
              window, the job will be restarted during the
              maintenance window after 7 days.

            * **SecurityConfiguration** *(string) --*

              The name of the "SecurityConfiguration" structure to be
              used with this action.

            * **NotificationProperty** *(dict) --*

              Specifies configuration properties of a job run
              notification.

              * **NotifyDelayAfter** *(integer) --*

                After a job run starts, the number of minutes to wait
                before sending a job run delay notification.

            * **CrawlerName** *(string) --*

              The name of the crawler to be used with this action.

        * **Predicate** *(dict) --*

          The predicate of this trigger, which defines when it will
          fire.

          * **Logical** *(string) --*

            An optional field if only one condition is listed. If
            multiple conditions are listed, then this field is
            required.

          * **Conditions** *(list) --*

            A list of the conditions that determine when the trigger
            will fire.

            * *(dict) --*

              Defines a condition under which a trigger fires.

              * **LogicalOperator** *(string) --*

                A logical operator.

              * **JobName** *(string) --*

                The name of the job whose "JobRuns" this condition
                applies to, and on which this trigger waits.

              * **State** *(string) --*

                The condition state. Currently, the only job states
                that a trigger can listen for are "SUCCEEDED",
                "STOPPED", "FAILED", and "TIMEOUT". The only crawler
                states that a trigger can listen for are "SUCCEEDED",
                "FAILED", and "CANCELLED".

              * **CrawlerName** *(string) --*

                The name of the crawler to which this condition
                applies.

              * **CrawlState** *(string) --*

                The state of the crawler to which this condition
                applies.

        * **EventBatchingCondition** *(dict) --*

          Batch condition that must be met (specified number of events
          received or batch time window expired) before EventBridge
          event trigger fires.

          * **BatchSize** *(integer) --* **[REQUIRED]**

            Number of events that must be received from Amazon
            EventBridge before EventBridge event trigger fires.

          * **BatchWindow** *(integer) --*

            Window of time in seconds after which EventBridge event
            trigger fires. Window starts when first event is received.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Trigger': {
                 'Name': 'string',
                 'WorkflowName': 'string',
                 'Id': 'string',
                 'Type': 'SCHEDULED'|'CONDITIONAL'|'ON_DEMAND'|'EVENT',
                 'State': 'CREATING'|'CREATED'|'ACTIVATING'|'ACTIVATED'|'DEACTIVATING'|'DEACTIVATED'|'DELETING'|'UPDATING',
                 'Description': 'string',
                 'Schedule': 'string',
                 'Actions': [
                     {
                         'JobName': 'string',
                         'Arguments': {
                             'string': 'string'
                         },
                         'Timeout': 123,
                         'SecurityConfiguration': 'string',
                         'NotificationProperty': {
                             'NotifyDelayAfter': 123
                         },
                         'CrawlerName': 'string'
                     },
                 ],
                 'Predicate': {
                     'Logical': 'AND'|'ANY',
                     'Conditions': [
                         {
                             'LogicalOperator': 'EQUALS',
                             'JobName': 'string',
                             'State': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT'|'ERROR'|'WAITING'|'EXPIRED',
                             'CrawlerName': 'string',
                             'CrawlState': 'RUNNING'|'CANCELLING'|'CANCELLED'|'SUCCEEDED'|'FAILED'|'ERROR'
                         },
                     ]
                 },
                 'EventBatchingCondition': {
                     'BatchSize': 123,
                     'BatchWindow': 123
                 }
             }
         }

      **Response Structure**

      * *(dict) --*

        * **Trigger** *(dict) --*

          The resulting trigger definition.

          * **Name** *(string) --*

            The name of the trigger.

          * **WorkflowName** *(string) --*

            The name of the workflow associated with the trigger.

          * **Id** *(string) --*

            Reserved for future use.

          * **Type** *(string) --*

            The type of trigger that this is.

          * **State** *(string) --*

            The current state of the trigger.

          * **Description** *(string) --*

            A description of this trigger.

          * **Schedule** *(string) --*

            A "cron" expression used to specify the schedule (see
            Time-Based Schedules for Jobs and Crawlers). For example,
            to run something every day at 12:15 UTC, you would
            specify: "cron(15 12 * * ? *)".

          * **Actions** *(list) --*

            The actions initiated by this trigger.

            * *(dict) --*

              Defines an action to be initiated by a trigger.

              * **JobName** *(string) --*

                The name of a job to be run.

              * **Arguments** *(dict) --*

                The job arguments used when this trigger fires. For
                this job run, they replace the default arguments set
                in the job definition itself.

                You can specify arguments here that your own job-
                execution script consumes, as well as arguments that
                Glue itself consumes.

                For information about how to specify and consume your
                own Job arguments, see the Calling Glue APIs in Python
                topic in the developer guide.

                For information about the key-value pairs that Glue
                consumes to set up your job, see the Special
                Parameters Used by Glue topic in the developer guide.

                * *(string) --*

                  * *(string) --*

              * **Timeout** *(integer) --*

                The "JobRun" timeout in minutes. This is the maximum
                time that a job run can consume resources before it is
                terminated and enters "TIMEOUT" status. This overrides
                the timeout value set in the parent job.

                Jobs must have timeout values less than 7 days (10080
                minutes). Otherwise, the jobs will throw an exception.

                When the value is left blank, the timeout is defaulted
                to 2880 minutes.

                Any existing Glue jobs that had a timeout value
                greater than 7 days will be defaulted to 7 days. For
                instance, if you have specified a timeout of 20 days
                for a batch job, it will be stopped on the 7th day.

                For streaming jobs, if you have set up a maintenance
                window, it will be restarted during the maintenance
                window after 7 days.

              * **SecurityConfiguration** *(string) --*

                The name of the "SecurityConfiguration" structure to
                be used with this action.

              * **NotificationProperty** *(dict) --*

                Specifies configuration properties of a job run
                notification.

                * **NotifyDelayAfter** *(integer) --*

                  After a job run starts, the number of minutes to
                  wait before sending a job run delay notification.

              * **CrawlerName** *(string) --*

                The name of the crawler to be used with this action.

          * **Predicate** *(dict) --*

            The predicate of this trigger, which defines when it will
            fire.

            * **Logical** *(string) --*

              An optional field if only one condition is listed. If
              multiple conditions are listed, then this field is
              required.

            * **Conditions** *(list) --*

              A list of the conditions that determine when the trigger
              will fire.

              * *(dict) --*

                Defines a condition under which a trigger fires.

                * **LogicalOperator** *(string) --*

                  A logical operator.

                * **JobName** *(string) --*

                  The name of the job whose "JobRuns" this condition
                  applies to, and on which this trigger waits.

                * **State** *(string) --*

                  The condition state. Currently, the only job states
                  that a trigger can listen for are "SUCCEEDED",
                  "STOPPED", "FAILED", and "TIMEOUT". The only crawler
                  states that a trigger can listen for are
                  "SUCCEEDED", "FAILED", and "CANCELLED".

                * **CrawlerName** *(string) --*

                  The name of the crawler to which this condition
                  applies.

                * **CrawlState** *(string) --*

                  The state of the crawler to which this condition
                  applies.

          * **EventBatchingCondition** *(dict) --*

            Batch condition that must be met (specified number of
            events received or batch time window expired) before
            EventBridge event trigger fires.

            * **BatchSize** *(integer) --*

              Number of events that must be received from Amazon
              EventBridge before EventBridge event trigger fires.

            * **BatchWindow** *(integer) --*

              Window of time in seconds after which EventBridge event
              trigger fires. Window starts when first event is
              received.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ConcurrentModificationException"
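
   As a minimal sketch of the "Predicate" rules described above, the
   helpers below build "Conditions" entries and require "Logical" when
   more than one condition is listed. The names "job_condition",
   "crawler_condition", and "build_predicate" are hypothetical and not
   part of the Glue API, and the sketch assumes that crawler conditions
   are expressed through "CrawlState".

```python
# States a trigger can listen for, as listed under the State field.
JOB_STATES = {"SUCCEEDED", "STOPPED", "FAILED", "TIMEOUT"}
CRAWLER_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def job_condition(job_name, state):
    """Build one Conditions entry that waits on a job run state (hypothetical helper)."""
    if state not in JOB_STATES:
        raise ValueError(f"unsupported job state: {state}")
    return {"LogicalOperator": "EQUALS", "JobName": job_name, "State": state}


def crawler_condition(crawler_name, crawl_state):
    """Build one Conditions entry that waits on a crawler state (hypothetical helper)."""
    if crawl_state not in CRAWLER_STATES:
        raise ValueError(f"unsupported crawler state: {crawl_state}")
    return {
        "LogicalOperator": "EQUALS",
        "CrawlerName": crawler_name,
        "CrawlState": crawl_state,
    }


def build_predicate(conditions, logical=None):
    """Assemble a Predicate dict; Logical is required for several conditions."""
    conditions = list(conditions)
    if not conditions:
        raise ValueError("at least one condition is needed")
    if len(conditions) > 1 and logical is None:
        raise ValueError("Logical is required when multiple conditions are listed")
    if logical is not None and logical not in ("AND", "ANY"):
        raise ValueError(f"unsupported Logical value: {logical}")
    predicate = {"Conditions": conditions}
    if logical is not None:
        predicate["Logical"] = logical
    return predicate
```

   The returned dict can be passed wherever a "Predicate" structure is
   accepted in a trigger request.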


get_ml_task_runs
****************

Glue.Client.get_ml_task_runs(**kwargs)

   Gets a list of runs for a machine learning transform. Machine
   learning task runs are asynchronous tasks that Glue runs on your
   behalf as part of various machine learning workflows. You can get a
   sortable, filterable list of machine learning task runs by calling
   "GetMLTaskRuns" with their parent transform's "TransformID" and
   other optional parameters as documented in this section.

   This operation returns a list of historic runs and must be
   paginated.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_ml_task_runs(
          TransformId='string',
          NextToken='string',
          MaxResults=123,
          Filter={
              'TaskRunType': 'EVALUATION'|'LABELING_SET_GENERATION'|'IMPORT_LABELS'|'EXPORT_LABELS'|'FIND_MATCHES',
              'Status': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT',
              'StartedBefore': datetime(2015, 1, 1),
              'StartedAfter': datetime(2015, 1, 1)
          },
          Sort={
              'Column': 'TASK_RUN_TYPE'|'STATUS'|'STARTED',
              'SortDirection': 'DESCENDING'|'ASCENDING'
          }
      )

   Parameters:
      * **TransformId** (*string*) --

        **[REQUIRED]**

        The unique identifier of the machine learning transform.

      * **NextToken** (*string*) -- A token for pagination of the
        results. The default is empty.

      * **MaxResults** (*integer*) -- The maximum number of results to
        return.

      * **Filter** (*dict*) --

        The filter criteria, in the "TaskRunFilterCriteria" structure,
        for the task run.

        * **TaskRunType** *(string) --*

          The type of task run.

        * **Status** *(string) --*

          The current status of the task run.

        * **StartedBefore** *(datetime) --*

          Filter on task runs started before this date.

        * **StartedAfter** *(datetime) --*

          Filter on task runs started after this date.

      * **Sort** (*dict*) --

        The sorting criteria, in the "TaskRunSortCriteria" structure,
        for the task run.

        * **Column** *(string) --* **[REQUIRED]**

          The column to be used to sort the list of task runs for the
          machine learning transform.

        * **SortDirection** *(string) --* **[REQUIRED]**

          The sort direction to be used to sort the list of task runs
          for the machine learning transform.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'TaskRuns': [
                 {
                     'TransformId': 'string',
                     'TaskRunId': 'string',
                     'Status': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT',
                     'LogGroupName': 'string',
                     'Properties': {
                         'TaskType': 'EVALUATION'|'LABELING_SET_GENERATION'|'IMPORT_LABELS'|'EXPORT_LABELS'|'FIND_MATCHES',
                         'ImportLabelsTaskRunProperties': {
                             'InputS3Path': 'string',
                             'Replace': True|False
                         },
                         'ExportLabelsTaskRunProperties': {
                             'OutputS3Path': 'string'
                         },
                         'LabelingSetGenerationTaskRunProperties': {
                             'OutputS3Path': 'string'
                         },
                         'FindMatchesTaskRunProperties': {
                             'JobId': 'string',
                             'JobName': 'string',
                             'JobRunId': 'string'
                         }
                     },
                     'ErrorString': 'string',
                     'StartedOn': datetime(2015, 1, 1),
                     'LastModifiedOn': datetime(2015, 1, 1),
                     'CompletedOn': datetime(2015, 1, 1),
                     'ExecutionTime': 123
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **TaskRuns** *(list) --*

          A list of task runs that are associated with the transform.

          * *(dict) --*

            A task run that is associated with the machine learning
            transform.

            * **TransformId** *(string) --*

              The unique identifier for the transform.

            * **TaskRunId** *(string) --*

              The unique identifier for this task run.

            * **Status** *(string) --*

              The current status of the requested task run.

            * **LogGroupName** *(string) --*

              The name of the log group for secure logging,
              associated with this task run.

            * **Properties** *(dict) --*

              Specifies configuration properties associated with this
              task run.

              * **TaskType** *(string) --*

                The type of task run.

              * **ImportLabelsTaskRunProperties** *(dict) --*

                The configuration properties for an importing labels
                task run.

                * **InputS3Path** *(string) --*

                  The Amazon Simple Storage Service (Amazon S3) path
                  from where you will import the labels.

                * **Replace** *(boolean) --*

                  Indicates whether to overwrite your existing labels.

              * **ExportLabelsTaskRunProperties** *(dict) --*

                The configuration properties for an exporting labels
                task run.

                * **OutputS3Path** *(string) --*

                  The Amazon Simple Storage Service (Amazon S3) path
                  where you will export the labels.

              * **LabelingSetGenerationTaskRunProperties** *(dict) --*

                The configuration properties for a labeling set
                generation task run.

                * **OutputS3Path** *(string) --*

                  The Amazon Simple Storage Service (Amazon S3) path
                  where you will generate the labeling set.

              * **FindMatchesTaskRunProperties** *(dict) --*

                The configuration properties for a find matches task
                run.

                * **JobId** *(string) --*

                  The job ID for the Find Matches task run.

                * **JobName** *(string) --*

                  The name assigned to the job for the Find Matches
                  task run.

                * **JobRunId** *(string) --*

                  The job run ID for the Find Matches task run.

            * **ErrorString** *(string) --*

              The list of error strings associated with this task run.

            * **StartedOn** *(datetime) --*

              The date and time that this task run started.

            * **LastModifiedOn** *(datetime) --*

              The last point in time that the requested task run was
              updated.

            * **CompletedOn** *(datetime) --*

              The last point in time that the requested task run was
              completed.

            * **ExecutionTime** *(integer) --*

              The amount of time (in seconds) that the task run
              consumed resources.

        * **NextToken** *(string) --*

          A pagination token, if more results are available.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"
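
   Because this operation must be paginated, one possible way to
   collect every task run is to follow "NextToken" until it is no
   longer returned. The sketch below assumes a Glue client created with
   "boto3.client('glue')"; the generator name "iter_ml_task_runs" and
   its "page_size" argument are hypothetical.

```python
def iter_ml_task_runs(client, transform_id, page_size=None, **optional):
    """Yield every task run of a transform by following NextToken.

    `optional` may carry the Filter and Sort dicts described above.
    """
    kwargs = {"TransformId": transform_id, **optional}
    if page_size is not None:
        kwargs["MaxResults"] = page_size
    while True:
        response = client.get_ml_task_runs(**kwargs)
        yield from response.get("TaskRuns", [])
        token = response.get("NextToken")
        if not token:
            # No token means the last page has been read.
            break
        kwargs["NextToken"] = token


# Example usage (needs AWS credentials, so it is left commented out):
# import boto3
# client = boto3.client('glue')
# for run in iter_ml_task_runs(client, 'my-transform-id',
#                              Filter={'Status': 'FAILED'}):
#     print(run['TaskRunId'], run['Status'])
```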


put_data_quality_profile_annotation
***********************************

Glue.Client.put_data_quality_profile_annotation(**kwargs)

   Annotate all datapoints for a Profile.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.put_data_quality_profile_annotation(
          ProfileId='string',
          InclusionAnnotation='INCLUDE'|'EXCLUDE'
      )

   Parameters:
      * **ProfileId** (*string*) --

        **[REQUIRED]**

        The ID of the data quality monitoring profile to annotate.

      * **InclusionAnnotation** (*string*) --

        **[REQUIRED]**

        The inclusion annotation value to apply to the profile.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

        Left blank.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"
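
   A small sketch of how the two "InclusionAnnotation" values might be
   chosen from a boolean flag. The wrapper name "annotate_profile" is
   hypothetical; only the "put_data_quality_profile_annotation" call
   and its two parameters come from the syntax above.

```python
def annotate_profile(client, profile_id, include):
    """Annotate all datapoints of a profile as INCLUDE or EXCLUDE."""
    annotation = "INCLUDE" if include else "EXCLUDE"
    client.put_data_quality_profile_annotation(
        ProfileId=profile_id,
        InclusionAnnotation=annotation,
    )
    # Return the value that was sent, since the response itself is empty.
    return annotation
```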


create_database
***************

Glue.Client.create_database(**kwargs)

   Creates a new database in a Data Catalog.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_database(
          CatalogId='string',
          DatabaseInput={
              'Name': 'string',
              'Description': 'string',
              'LocationUri': 'string',
              'Parameters': {
                  'string': 'string'
              },
              'CreateTableDefaultPermissions': [
                  {
                      'Principal': {
                          'DataLakePrincipalIdentifier': 'string'
                      },
                      'Permissions': [
                          'ALL'|'SELECT'|'ALTER'|'DROP'|'DELETE'|'INSERT'|'CREATE_DATABASE'|'CREATE_TABLE'|'DATA_LOCATION_ACCESS',
                      ]
                  },
              ],
              'TargetDatabase': {
                  'CatalogId': 'string',
                  'DatabaseName': 'string',
                  'Region': 'string'
              },
              'FederatedDatabase': {
                  'Identifier': 'string',
                  'ConnectionName': 'string',
                  'ConnectionType': 'string'
              }
          },
          Tags={
              'string': 'string'
          }
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog in
        which to create the database. If none is provided, the Amazon
        Web Services account ID is used by default.

      * **DatabaseInput** (*dict*) --

        **[REQUIRED]**

        The metadata for the database.

        * **Name** *(string) --* **[REQUIRED]**

          The name of the database. For Hive compatibility, this is
          folded to lowercase when it is stored.

        * **Description** *(string) --*

          A description of the database.

        * **LocationUri** *(string) --*

          The location of the database (for example, an HDFS path).

        * **Parameters** *(dict) --*

          These key-value pairs define parameters and properties of
          the database.

          * *(string) --*

            * *(string) --*

        * **CreateTableDefaultPermissions** *(list) --*

          Creates a set of default permissions on the table for
          principals. Used by Lake Formation. Not used in the normal
          course of Glue operations.

          * *(dict) --*

            Permissions granted to a principal.

            * **Principal** *(dict) --*

              The principal who is granted permissions.

              * **DataLakePrincipalIdentifier** *(string) --*

                An identifier for the Lake Formation principal.

            * **Permissions** *(list) --*

              The permissions that are granted to the principal.

              * *(string) --*

        * **TargetDatabase** *(dict) --*

          A "DatabaseIdentifier" structure that describes a target
          database for resource linking.

          * **CatalogId** *(string) --*

            The ID of the Data Catalog in which the database resides.

          * **DatabaseName** *(string) --*

            The name of the catalog database.

          * **Region** *(string) --*

            Region of the target database.

        * **FederatedDatabase** *(dict) --*

          A "FederatedDatabase" structure that references an entity
          outside the Glue Data Catalog.

          * **Identifier** *(string) --*

            A unique identifier for the federated database.

          * **ConnectionName** *(string) --*

            The name of the connection to the external metastore.

          * **ConnectionType** *(string) --*

            The type of connection used to access the federated
            database, such as JDBC, ODBC, or other supported
            connection protocols.

      * **Tags** (*dict*) --

        The tags you assign to the database.

        * *(string) --*

          * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"

   * "Glue.Client.exceptions.ConcurrentModificationException"

   * "Glue.Client.exceptions.FederatedResourceAlreadyExistsException"

   * "Glue.Client.exceptions.FederationSourceException"

   * "Glue.Client.exceptions.FederationSourceRetryableException"
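
   The following sketch shows one way to assemble a minimal
   "DatabaseInput" and to treat "AlreadyExistsException" as a
   non-fatal outcome. The helper names "build_database_input" and
   "create_database_if_missing" are hypothetical, and only a few of
   the optional fields are covered.

```python
def build_database_input(name, description=None, location_uri=None, parameters=None):
    """Assemble a DatabaseInput dict, leaving out fields that were not given."""
    database_input = {"Name": name}
    if description is not None:
        database_input["Description"] = description
    if location_uri is not None:
        database_input["LocationUri"] = location_uri
    if parameters:
        database_input["Parameters"] = dict(parameters)
    return database_input


def create_database_if_missing(client, name, **fields):
    """Return True if the database was created, False if it already existed."""
    try:
        client.create_database(DatabaseInput=build_database_input(name, **fields))
    except client.exceptions.AlreadyExistsException:
        return False
    return True


# Example usage (needs AWS credentials, so it is left commented out):
# import boto3
# client = boto3.client('glue')
# create_database_if_missing(client, 'sales', description='Sales data')
```

   Since the name is folded to lowercase when stored, passing a
   lowercase name avoids surprises when the database is looked up
   later.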


test_connection
***************

Glue.Client.test_connection(**kwargs)

   Tests a connection to a service to validate the service credentials
   that you provide.

   You can either provide an existing connection name or a
   "TestConnectionInput" for testing a non-existing connection input.
   Providing both at the same time will cause an error.

   If the action is successful, the service sends back an HTTP 200
   response.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.test_connection(
          ConnectionName='string',
          CatalogId='string',
          TestConnectionInput={
              'ConnectionType': 'JDBC'|'SFTP'|'MONGODB'|'KAFKA'|'NETWORK'|'MARKETPLACE'|'CUSTOM'|'SALESFORCE'|'VIEW_VALIDATION_REDSHIFT'|'VIEW_VALIDATION_ATHENA'|'GOOGLEADS'|'GOOGLESHEETS'|'GOOGLEANALYTICS4'|'SERVICENOW'|'MARKETO'|'SAPODATA'|'ZENDESK'|'JIRACLOUD'|'NETSUITEERP'|'HUBSPOT'|'FACEBOOKADS'|'INSTAGRAMADS'|'ZOHOCRM'|'SALESFORCEPARDOT'|'SALESFORCEMARKETINGCLOUD'|'SLACK'|'STRIPE'|'INTERCOM'|'SNAPCHATADS',
              'ConnectionProperties': {
                  'string': 'string'
              },
              'AuthenticationConfiguration': {
                  'AuthenticationType': 'BASIC'|'OAUTH2'|'CUSTOM'|'IAM',
                  'OAuth2Properties': {
                      'OAuth2GrantType': 'AUTHORIZATION_CODE'|'CLIENT_CREDENTIALS'|'JWT_BEARER',
                      'OAuth2ClientApplication': {
                          'UserManagedClientApplicationClientId': 'string',
                          'AWSManagedClientApplicationReference': 'string'
                      },
                      'TokenUrl': 'string',
                      'TokenUrlParametersMap': {
                          'string': 'string'
                      },
                      'AuthorizationCodeProperties': {
                          'AuthorizationCode': 'string',
                          'RedirectUri': 'string'
                      },
                      'OAuth2Credentials': {
                          'UserManagedClientApplicationClientSecret': 'string',
                          'AccessToken': 'string',
                          'RefreshToken': 'string',
                          'JwtToken': 'string'
                      }
                  },
                  'SecretArn': 'string',
                  'KmsKeyArn': 'string',
                  'BasicAuthenticationCredentials': {
                      'Username': 'string',
                      'Password': 'string'
                  },
                  'CustomAuthenticationCredentials': {
                      'string': 'string'
                  }
              }
          }
      )

   Parameters:
      * **ConnectionName** (*string*) -- Optional. The name of the
        connection to test. If only name is provided, the operation
        will get the connection and use that for testing.

      * **CatalogId** (*string*) -- The catalog ID where the
        connection resides.

      * **TestConnectionInput** (*dict*) --

        A structure that is used to specify testing a connection to a
        service.

        * **ConnectionType** *(string) --* **[REQUIRED]**

          The type of connection to test. This operation is only
          available for the "JDBC" or "SALESFORCE" connection types.

        * **ConnectionProperties** *(dict) --* **[REQUIRED]**

          The key-value pairs that define parameters for the
          connection.

          JDBC connections use the following connection properties:

          * Required: All of ( "HOST", "PORT", "JDBC_ENGINE") or
            "JDBC_CONNECTION_URL".

          * Required: All of ( "USERNAME", "PASSWORD") or "SECRET_ID".

          * Optional: "JDBC_ENFORCE_SSL", "CUSTOM_JDBC_CERT",
            "CUSTOM_JDBC_CERT_STRING",
            "SKIP_CUSTOM_JDBC_CERT_VALIDATION". These parameters are
            used to configure SSL with JDBC.

          SALESFORCE connections require the
          "AuthenticationConfiguration" member to be configured.

          * *(string) --*

            * *(string) --*

        * **AuthenticationConfiguration** *(dict) --*

          A structure containing the authentication configuration in
          the TestConnection request. Required for a connection to
          Salesforce using OAuth authentication.

          * **AuthenticationType** *(string) --*

            A structure containing the authentication configuration in
            the CreateConnection request.

          * **OAuth2Properties** *(dict) --*

            The properties for OAuth2 authentication in the
            CreateConnection request.

            * **OAuth2GrantType** *(string) --*

              The OAuth2 grant type in the CreateConnection request.
              For example, "AUTHORIZATION_CODE", "JWT_BEARER", or
              "CLIENT_CREDENTIALS".

            * **OAuth2ClientApplication** *(dict) --*

              The client application type in the CreateConnection
              request. For example, "AWS_MANAGED" or "USER_MANAGED".

              * **UserManagedClientApplicationClientId** *(string) --*

                The client application clientID if the ClientAppType
                is "USER_MANAGED".

              * **AWSManagedClientApplicationReference** *(string) --*

                The reference to the SaaS-side client app that is
                Amazon Web Services managed.

            * **TokenUrl** *(string) --*

              The URL of the provider's authentication server, to
              exchange an authorization code for an access token.

            * **TokenUrlParametersMap** *(dict) --*

              A map of parameters that are added to the token "GET"
              request.

              * *(string) --*

                * *(string) --*

            * **AuthorizationCodeProperties** *(dict) --*

              The set of properties required for the OAuth2
              "AUTHORIZATION_CODE" grant type.

              * **AuthorizationCode** *(string) --*

                An authorization code to be used in the third leg of
                the "AUTHORIZATION_CODE" grant workflow. This is a
                single-use code which becomes invalid once exchanged
                for an access token, thus it is acceptable to have
                this value as a request parameter.

              * **RedirectUri** *(string) --*

                The redirect URI to which the authorization server
                redirects the user when issuing an authorization
                code. The URI is subsequently used when the
                authorization code is exchanged for an access token.

            * **OAuth2Credentials** *(dict) --*

              The credentials used when the authentication type is
              OAuth2 authentication.

              * **UserManagedClientApplicationClientSecret** *(string)
                --*

                The client application client secret if the client
                application is user managed.

              * **AccessToken** *(string) --*

                The access token used when the authentication type is
                OAuth2.

              * **RefreshToken** *(string) --*

                The refresh token used when the authentication type is
                OAuth2.

              * **JwtToken** *(string) --*

                The JSON Web Token (JWT) used when the authentication
                type is OAuth2.

          * **SecretArn** *(string) --*

            The secret manager ARN to store credentials in the
            CreateConnection request.

          * **KmsKeyArn** *(string) --*

            The ARN of the KMS key used to encrypt the connection.
            Only taken as an input in the request and stored in the
            Secret Manager.

          * **BasicAuthenticationCredentials** *(dict) --*

            The credentials used when the authentication type is basic
            authentication.

            * **Username** *(string) --*

              The username to connect to the data source.

            * **Password** *(string) --*

              The password to connect to the data source.

          * **CustomAuthenticationCredentials** *(dict) --*

            The credentials used when the authentication type is
            custom authentication.

            * *(string) --*

              * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.GlueEncryptionException"

   * "Glue.Client.exceptions.FederationSourceException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.ConflictException"

   * "Glue.Client.exceptions.InternalServiceException"


batch_get_data_quality_result
*****************************

Glue.Client.batch_get_data_quality_result(**kwargs)

   Retrieves a list of data quality results for the specified result
   IDs.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.batch_get_data_quality_result(
          ResultIds=[
              'string',
          ]
      )

   Parameters:
      **ResultIds** (*list*) --

      **[REQUIRED]**

      A list of unique result IDs for the data quality results.

      * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Results': [
                 {
                     'ResultId': 'string',
                     'ProfileId': 'string',
                     'Score': 123.0,
                     'DataSource': {
                         'GlueTable': {
                             'DatabaseName': 'string',
                             'TableName': 'string',
                             'CatalogId': 'string',
                             'ConnectionName': 'string',
                             'AdditionalOptions': {
                                 'string': 'string'
                             }
                         }
                     },
                     'RulesetName': 'string',
                     'EvaluationContext': 'string',
                     'StartedOn': datetime(2015, 1, 1),
                     'CompletedOn': datetime(2015, 1, 1),
                     'JobName': 'string',
                     'JobRunId': 'string',
                     'RulesetEvaluationRunId': 'string',
                     'RuleResults': [
                         {
                             'Name': 'string',
                             'Description': 'string',
                             'EvaluationMessage': 'string',
                             'Result': 'PASS'|'FAIL'|'ERROR',
                             'EvaluatedMetrics': {
                                 'string': 123.0
                             },
                             'EvaluatedRule': 'string',
                             'RuleMetrics': {
                                 'string': 123.0
                             }
                         },
                     ],
                     'AnalyzerResults': [
                         {
                             'Name': 'string',
                             'Description': 'string',
                             'EvaluationMessage': 'string',
                             'EvaluatedMetrics': {
                                 'string': 123.0
                             }
                         },
                     ],
                     'Observations': [
                         {
                             'Description': 'string',
                             'MetricBasedObservation': {
                                 'MetricName': 'string',
                                 'StatisticId': 'string',
                                 'MetricValues': {
                                     'ActualValue': 123.0,
                                     'ExpectedValue': 123.0,
                                     'LowerLimit': 123.0,
                                     'UpperLimit': 123.0
                                 },
                                 'NewRules': [
                                     'string',
                                 ]
                             }
                         },
                     ],
                     'AggregatedMetrics': {
                         'TotalRowsProcessed': 123.0,
                         'TotalRowsPassed': 123.0,
                         'TotalRowsFailed': 123.0,
                         'TotalRulesProcessed': 123.0,
                         'TotalRulesPassed': 123.0,
                         'TotalRulesFailed': 123.0
                     }
                 },
             ],
             'ResultsNotFound': [
                 'string',
             ]
         }

      **Response Structure**

      * *(dict) --*

        * **Results** *(list) --*

          A list of "DataQualityResult" objects representing the data
          quality results.

          * *(dict) --*

            Describes a data quality result.

            * **ResultId** *(string) --*

              A unique result ID for the data quality result.

            * **ProfileId** *(string) --*

              The Profile ID for the data quality result.

            * **Score** *(float) --*

              An aggregate data quality score. Represents the ratio of
              rules that passed to the total number of rules.

            * **DataSource** *(dict) --*

              The table associated with the data quality result, if
              any.

              * **GlueTable** *(dict) --*

                A Glue table.

                * **DatabaseName** *(string) --*

                  A database name in the Glue Data Catalog.

                * **TableName** *(string) --*

                  A table name in the Glue Data Catalog.

                * **CatalogId** *(string) --*

                  A unique identifier for the Glue Data Catalog.

                * **ConnectionName** *(string) --*

                  The name of the connection to the Glue Data Catalog.

                * **AdditionalOptions** *(dict) --*

                  Additional options for the table. Currently there
                  are two keys supported:

                  * "pushDownPredicate": to filter on partitions
                    without having to list and read all the files in
                    your dataset.

                  * "catalogPartitionPredicate": to use server-side
                    partition pruning using partition indexes in the
                    Glue Data Catalog.

                  * *(string) --*

                    * *(string) --*

            * **RulesetName** *(string) --*

              The name of the ruleset associated with the data quality
              result.

            * **EvaluationContext** *(string) --*

              In the context of a job in Glue Studio, each node in the
              canvas is typically assigned some sort of name and data
              quality nodes will have names. In the case of multiple
              nodes, the "evaluationContext" can differentiate the
              nodes.

            * **StartedOn** *(datetime) --*

              The date and time when this data quality run started.

            * **CompletedOn** *(datetime) --*

              The date and time when this data quality run completed.

            * **JobName** *(string) --*

              The job name associated with the data quality result, if
              any.

            * **JobRunId** *(string) --*

              The job run ID associated with the data quality result,
              if any.

            * **RulesetEvaluationRunId** *(string) --*

              The unique run ID for the ruleset evaluation for this
              data quality result.

            * **RuleResults** *(list) --*

              A list of "DataQualityRuleResult" objects representing
              the results for each rule.

              * *(dict) --*

                Describes the result of the evaluation of a data
                quality rule.

                * **Name** *(string) --*

                  The name of the data quality rule.

                * **Description** *(string) --*

                  A description of the data quality rule.

                * **EvaluationMessage** *(string) --*

                  An evaluation message.

                * **Result** *(string) --*

                  The status of the rule: "PASS", "FAIL", or "ERROR".

                * **EvaluatedMetrics** *(dict) --*

                  A map of metrics associated with the evaluation of
                  the rule.

                  * *(string) --*

                    * *(float) --*

                * **EvaluatedRule** *(string) --*

                  The evaluated rule.

                * **RuleMetrics** *(dict) --*

                  A map containing metrics associated with the
                  evaluation of the rule based on row-level results.

                  * *(string) --*

                    * *(float) --*

            * **AnalyzerResults** *(list) --*

              A list of "DataQualityAnalyzerResult" objects
              representing the results for each analyzer.

              * *(dict) --*

                Describes the result of the evaluation of a data
                quality analyzer.

                * **Name** *(string) --*

                  The name of the data quality analyzer.

                * **Description** *(string) --*

                  A description of the data quality analyzer.

                * **EvaluationMessage** *(string) --*

                  An evaluation message.

                * **EvaluatedMetrics** *(dict) --*

                  A map of metrics associated with the evaluation of
                  the analyzer.

                  * *(string) --*

                    * *(float) --*

            * **Observations** *(list) --*

              A list of "DataQualityObservation" objects representing
              the observations generated after evaluating the rules
              and analyzers.

              * *(dict) --*

                Describes the observation generated after evaluating
                the rules and analyzers.

                * **Description** *(string) --*

                  A description of the data quality observation.

                * **MetricBasedObservation** *(dict) --*

                  An object of type "MetricBasedObservation"
                  representing the observation that is based on
                  evaluated data quality metrics.

                  * **MetricName** *(string) --*

                    The name of the data quality metric used for
                    generating the observation.

                  * **StatisticId** *(string) --*

                    The Statistic ID.

                  * **MetricValues** *(dict) --*

                    An object of type "DataQualityMetricValues"
                    representing the analysis of the data quality
                    metric value.

                    * **ActualValue** *(float) --*

                      The actual value of the data quality metric.

                    * **ExpectedValue** *(float) --*

                      The expected value of the data quality metric
                      according to the analysis of historical data.

                    * **LowerLimit** *(float) --*

                      The lower limit of the data quality metric value
                      according to the analysis of historical data.

                    * **UpperLimit** *(float) --*

                      The upper limit of the data quality metric value
                      according to the analysis of historical data.

                  * **NewRules** *(list) --*

                    A list of new data quality rules generated as part
                    of the observation based on the data quality
                    metric value.

                    * *(string) --*

            * **AggregatedMetrics** *(dict) --*

              A summary of "DataQualityAggregatedMetrics" objects
              showing the total counts of processed rows and rules,
              including their pass/fail statistics based on row-level
              results.

              * **TotalRowsProcessed** *(float) --*

                The total number of rows that were processed during
                the data quality evaluation.

              * **TotalRowsPassed** *(float) --*

                The total number of rows that passed all applicable
                data quality rules.

              * **TotalRowsFailed** *(float) --*

                The total number of rows that failed one or more data
                quality rules.

              * **TotalRulesProcessed** *(float) --*

                The total number of data quality rules that were
                evaluated.

              * **TotalRulesPassed** *(float) --*

                The total number of data quality rules that passed
                their evaluation criteria.

              * **TotalRulesFailed** *(float) --*

                The total number of data quality rules that failed
                their evaluation criteria.

        * **ResultsNotFound** *(list) --*

          A list of result IDs for which results were not found.

          * *(string) --*

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"
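
   The response structure above can be reduced to a short summary per
   result. The following is a minimal sketch, not part of the API: the
   helper name "summarize_quality_results" is hypothetical, and it
   assumes that optional fields may be missing from a response, so
   every key is read defensively.

```python
def summarize_quality_results(response):
    """Summarize a batch_get_data_quality_result response dict.

    Returns (summaries, missing_ids). Keys are read with .get() because
    this sketch assumes optional fields can be absent from a response.
    """
    summaries = []
    for result in response.get("Results", []):
        rule_results = result.get("RuleResults", [])
        # "Result" is one of 'PASS', 'FAIL' or 'ERROR' per the response syntax;
        # anything other than 'PASS' is reported here.
        not_passed = [
            rule.get("Name")
            for rule in rule_results
            if rule.get("Result") != "PASS"
        ]
        summaries.append(
            {
                "ResultId": result.get("ResultId"),
                "Score": result.get("Score"),
                "RulesNotPassed": not_passed,
            }
        )
    # IDs the service could not resolve are listed under "ResultsNotFound".
    return summaries, list(response.get("ResultsNotFound", []))
```

   With a real client, the input would come from
   "client.batch_get_data_quality_result(ResultIds=[...])".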


update_data_quality_ruleset
***************************

Glue.Client.update_data_quality_ruleset(**kwargs)

   Updates the specified data quality ruleset.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.update_data_quality_ruleset(
          Name='string',
          Description='string',
          Ruleset='string'
      )

   Parameters:
      * **Name** (*string*) --

        **[REQUIRED]**

        The name of the data quality ruleset.

      * **Description** (*string*) -- A description of the ruleset.

      * **Ruleset** (*string*) -- A Data Quality Definition Language
        (DQDL) ruleset. For more information, see the Glue developer
        guide.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Name': 'string',
             'Description': 'string',
             'Ruleset': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Name** *(string) --*

          The name of the data quality ruleset.

        * **Description** *(string) --*

          A description of the ruleset.

        * **Ruleset** *(string) --*

          A Data Quality Definition Language (DQDL) ruleset. For more
          information, see the Glue developer guide.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.IdempotentParameterMismatchException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"
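
   As an illustration of the request shape, the sketch below builds
   the keyword arguments for this call. The helper name, the ruleset
   name "orders_ruleset", and the DQDL text are all hypothetical
   examples; consult the Glue developer guide for the DQDL rules that
   apply to your data.

```python
def build_update_ruleset_request(name, description=None, ruleset=None):
    """Build keyword arguments for update_data_quality_ruleset.

    Name is required; Description and Ruleset are optional and are left
    out of the request when not provided.
    """
    if not name:
        raise ValueError("Name is required")
    request = {"Name": name}
    if description is not None:
        request["Description"] = description
    if ruleset is not None:
        request["Ruleset"] = ruleset
    return request


# Hypothetical ruleset name and DQDL text, for illustration only.
EXAMPLE_DQDL = 'Rules = [ IsComplete "order_id" ]'
request = build_update_ruleset_request("orders_ruleset", ruleset=EXAMPLE_DQDL)
```

   The resulting dict could then be passed as
   "client.update_data_quality_ruleset(**request)".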


batch_get_custom_entity_types
*****************************

Glue.Client.batch_get_custom_entity_types(**kwargs)

   Retrieves the details for the custom patterns specified by a list
   of names.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.batch_get_custom_entity_types(
          Names=[
              'string',
          ]
      )

   Parameters:
      **Names** (*list*) --

      **[REQUIRED]**

      A list of names of the custom patterns that you want to
      retrieve.

      * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'CustomEntityTypes': [
                 {
                     'Name': 'string',
                     'RegexString': 'string',
                     'ContextWords': [
                         'string',
                     ]
                 },
             ],
             'CustomEntityTypesNotFound': [
                 'string',
             ]
         }

      **Response Structure**

      * *(dict) --*

        * **CustomEntityTypes** *(list) --*

          A list of "CustomEntityType" objects representing the custom
          patterns that have been created.

          * *(dict) --*

            An object representing a custom pattern for detecting
            sensitive data across the columns and rows of your
            structured data.

            * **Name** *(string) --*

              A name for the custom pattern that allows it to be
              retrieved or deleted later. This name must be unique per
              Amazon Web Services account.

            * **RegexString** *(string) --*

              A regular expression string that is used for detecting
              sensitive data in a custom pattern.

            * **ContextWords** *(list) --*

              A list of context words. If none of these context words
              are found within the vicinity of the regular expression,
              the data will not be detected as sensitive data.

              If no context words are passed, only the regular
              expression is checked.

              * *(string) --*

        * **CustomEntityTypesNotFound** *(list) --*

          A list of the names of custom patterns that were not found.

          * *(string) --*

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
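
   Because a pattern without context words is matched on its regular
   expression alone, it can be useful to separate the two kinds when
   reviewing a response. The following sketch assumes only the
   response structure shown above; the helper name
   "split_by_context_words" is hypothetical.

```python
def split_by_context_words(response):
    """Group custom pattern names from a batch_get_custom_entity_types response.

    Returns (with_context, regex_only, not_found) as lists of names.
    """
    with_context = []
    regex_only = []
    for entity in response.get("CustomEntityTypes", []):
        # An absent or empty ContextWords list means only the regular
        # expression is checked, per the field description.
        if entity.get("ContextWords"):
            with_context.append(entity["Name"])
        else:
            regex_only.append(entity["Name"])
    not_found = list(response.get("CustomEntityTypesNotFound", []))
    return with_context, regex_only, not_found
```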


get_paginator
*************

Glue.Client.get_paginator(operation_name)

   Create a paginator for an operation.

   Parameters:
      **operation_name** (*string*) -- The operation name. This is
      the same name as the method name on the client. For example, if
      the method name is "create_foo" and you would normally invoke
      the operation as "client.create_foo(**kwargs)", then, provided
      the "create_foo" operation can be paginated, you can call
      "client.get_paginator("create_foo")".

   Raises:
      **OperationNotPageableError** -- Raised if the operation is not
      pageable.  You can use the "client.can_paginate" method to check
      if an operation is pageable.

   Return type:
      "botocore.paginate.Paginator"

   Returns:
      A paginator object.
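
   One way to avoid "OperationNotPageableError" is to check
   "can_paginate" first and fall back to a direct call. The sketch
   below is one possible pattern; the helper name "iter_pages" is
   hypothetical, and "client" is assumed to be a Glue client created
   with "boto3.client('glue')".

```python
def iter_pages(client, operation_name, **kwargs):
    """Yield response pages for operation_name.

    Uses a paginator when the operation is pageable; otherwise makes a
    single direct call and yields that one response.
    """
    if client.can_paginate(operation_name):
        paginator = client.get_paginator(operation_name)
        # paginate() returns an iterable of response pages.
        yield from paginator.paginate(**kwargs)
    else:
        yield getattr(client, operation_name)(**kwargs)
```

   For example, "for page in iter_pages(client, 'get_databases'): ..."
   would iterate over every page of that operation, assuming it is
   pageable on your client version.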


get_dataflow_graph
******************

Glue.Client.get_dataflow_graph(**kwargs)

   Transforms a Python script into a directed acyclic graph (DAG).

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_dataflow_graph(
          PythonScript='string'
      )

   Parameters:
      **PythonScript** (*string*) -- The Python script to transform.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'DagNodes': [
                 {
                     'Id': 'string',
                     'NodeType': 'string',
                     'Args': [
                         {
                             'Name': 'string',
                             'Value': 'string',
                             'Param': True|False
                         },
                     ],
                     'LineNumber': 123
                 },
             ],
             'DagEdges': [
                 {
                     'Source': 'string',
                     'Target': 'string',
                     'TargetParameter': 'string'
                 },
             ]
         }

      **Response Structure**

      * *(dict) --*

        * **DagNodes** *(list) --*

          A list of the nodes in the resulting DAG.

          * *(dict) --*

            Represents a node in a directed acyclic graph (DAG).

            * **Id** *(string) --*

              A node identifier that is unique within the node's
              graph.

            * **NodeType** *(string) --*

              The type of node that this is.

            * **Args** *(list) --*

              Properties of the node, in the form of name-value pairs.

              * *(dict) --*

                An argument or property of a node.

                * **Name** *(string) --*

                  The name of the argument or property.

                * **Value** *(string) --*

                  The value of the argument or property.

                * **Param** *(boolean) --*

                  True if the value is used as a parameter.

            * **LineNumber** *(integer) --*

              The line number of the node.

        * **DagEdges** *(list) --*

          A list of the edges in the resulting DAG.

          * *(dict) --*

            Represents a directional edge in a directed acyclic graph
            (DAG).

            * **Source** *(string) --*

              The ID of the node at which the edge starts.

            * **Target** *(string) --*

              The ID of the node at which the edge ends.

            * **TargetParameter** *(string) --*

              The target of the edge.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
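
   The "DagNodes" and "DagEdges" lists can be combined into an
   ordering in which every edge's "Source" comes before its "Target".
   The following is a minimal sketch using the standard-library
   "graphlib" module (Python 3.9 or later); the helper name
   "dag_execution_order" is hypothetical and not part of the API.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def dag_execution_order(response):
    """Return node IDs ordered so each edge's Source precedes its Target."""
    # Map every node ID to the set of node IDs that must come before it.
    predecessors = {node["Id"]: set() for node in response.get("DagNodes", [])}
    for edge in response.get("DagEdges", []):
        predecessors.setdefault(edge["Source"], set())
        predecessors.setdefault(edge["Target"], set()).add(edge["Source"])
    # static_order() raises graphlib.CycleError if the edges form a cycle.
    return list(TopologicalSorter(predecessors).static_order())
```

   With a real client, the input would come from
   "client.get_dataflow_graph(PythonScript=script_text)".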


create_catalog
**************

Glue.Client.create_catalog(**kwargs)

   Creates a new catalog in the Glue Data Catalog.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_catalog(
          Name='string',
          CatalogInput={
              'Description': 'string',
              'FederatedCatalog': {
                  'Identifier': 'string',
                  'ConnectionName': 'string',
                  'ConnectionType': 'string'
              },
              'Parameters': {
                  'string': 'string'
              },
              'TargetRedshiftCatalog': {
                  'CatalogArn': 'string'
              },
              'CatalogProperties': {
                  'DataLakeAccessProperties': {
                      'DataLakeAccess': True|False,
                      'DataTransferRole': 'string',
                      'KmsKey': 'string',
                      'CatalogType': 'string'
                  },
                  'IcebergOptimizationProperties': {
                      'RoleArn': 'string',
                      'Compaction': {
                          'string': 'string'
                      },
                      'Retention': {
                          'string': 'string'
                      },
                      'OrphanFileDeletion': {
                          'string': 'string'
                      }
                  },
                  'CustomProperties': {
                      'string': 'string'
                  }
              },
              'CreateTableDefaultPermissions': [
                  {
                      'Principal': {
                          'DataLakePrincipalIdentifier': 'string'
                      },
                      'Permissions': [
                          'ALL'|'SELECT'|'ALTER'|'DROP'|'DELETE'|'INSERT'|'CREATE_DATABASE'|'CREATE_TABLE'|'DATA_LOCATION_ACCESS',
                      ]
                  },
              ],
              'CreateDatabaseDefaultPermissions': [
                  {
                      'Principal': {
                          'DataLakePrincipalIdentifier': 'string'
                      },
                      'Permissions': [
                          'ALL'|'SELECT'|'ALTER'|'DROP'|'DELETE'|'INSERT'|'CREATE_DATABASE'|'CREATE_TABLE'|'DATA_LOCATION_ACCESS',
                      ]
                  },
              ],
              'AllowFullTableExternalDataAccess': 'True'|'False'
          },
          Tags={
              'string': 'string'
          }
      )

   Parameters:
      * **Name** (*string*) --

        **[REQUIRED]**

        The name of the catalog to create.

      * **CatalogInput** (*dict*) --

        **[REQUIRED]**

        A "CatalogInput" object that defines the metadata for the
        catalog.

        * **Description** *(string) --*

          A description of the catalog. A string of not more than
          2048 bytes, matching the URI address multi-line string
          pattern.

        * **FederatedCatalog** *(dict) --*

          A "FederatedCatalog" object that references an entity
          outside the Glue Data Catalog, for example a Redshift
          database.

          * **Identifier** *(string) --*

            A unique identifier for the federated catalog.

          * **ConnectionName** *(string) --*

            The name of the connection to an external data source, for
            example a Redshift-federated catalog.

          * **ConnectionType** *(string) --*

            The type of connection used to access the federated
            catalog, specifying the protocol or method for connection
            to the external data source.

        * **Parameters** *(dict) --*

          A map array of key-value pairs that define the parameters
          and properties of the catalog.

          * *(string) --*

            * *(string) --*

        * **TargetRedshiftCatalog** *(dict) --*

          A "TargetRedshiftCatalog" object that describes a target
          catalog for resource linking.

          * **CatalogArn** *(string) --* **[REQUIRED]**

            The Amazon Resource Name (ARN) of the catalog resource.

        * **CatalogProperties** *(dict) --*

          A "CatalogProperties" object that specifies data lake access
          properties and other custom properties.

          * **DataLakeAccessProperties** *(dict) --*

            A "DataLakeAccessProperties" object that specifies
            properties to configure data lake access for your catalog
            resource in the Glue Data Catalog.

            * **DataLakeAccess** *(boolean) --*

              Turns on or off data lake access for Apache Spark
              applications that access Amazon Redshift databases in
              the Data Catalog from any non-Redshift engine, such as
              Amazon Athena, Amazon EMR, or Glue ETL.

            * **DataTransferRole** *(string) --*

              A role that will be assumed by Glue for transferring
              data into/out of the staging bucket during a query.

            * **KmsKey** *(string) --*

              An encryption key that will be used for the staging
              bucket that will be created along with the catalog.

            * **CatalogType** *(string) --*

              Specifies a federated catalog type for the native
              catalog resource. The currently supported type is
              "aws:redshift".

          * **IcebergOptimizationProperties** *(dict) --*

            A structure that specifies Iceberg table optimization
            properties for the catalog. This includes configuration
            for compaction, retention, and orphan file deletion
            operations that can be applied to Iceberg tables in this
            catalog.

            * **RoleArn** *(string) --*

              The Amazon Resource Name (ARN) of the IAM role that will
              be assumed to perform Iceberg table optimization
              operations.

            * **Compaction** *(dict) --*

              A map of key-value pairs that specify configuration
              parameters for Iceberg table compaction operations,
              which optimize the layout of data files to improve query
              performance.

              * *(string) --*

                * *(string) --*

            * **Retention** *(dict) --*

              A map of key-value pairs that specify configuration
              parameters for Iceberg table retention operations, which
              manage the lifecycle of table snapshots to control
              storage costs.

              * *(string) --*

                * *(string) --*

            * **OrphanFileDeletion** *(dict) --*

              A map of key-value pairs that specify configuration
              parameters for Iceberg orphan file deletion operations,
              which identify and remove files that are no longer
              referenced by the table metadata.

              * *(string) --*

                * *(string) --*

          * **CustomProperties** *(dict) --*

            Additional key-value properties for the catalog, such as
            column statistics optimizations.

            * *(string) --*

              * *(string) --*

        * **CreateTableDefaultPermissions** *(list) --*

          An array of "PrincipalPermissions" objects. Creates a set of
          default permissions on the table(s) for principals. Used by
          Amazon Web Services Lake Formation. Typically should be
          explicitly set as an empty list.

          * *(dict) --*

            Permissions granted to a principal.

            * **Principal** *(dict) --*

              The principal who is granted permissions.

              * **DataLakePrincipalIdentifier** *(string) --*

                An identifier for the Lake Formation principal.

            * **Permissions** *(list) --*

              The permissions that are granted to the principal.

              * *(string) --*

        * **CreateDatabaseDefaultPermissions** *(list) --*

          An array of "PrincipalPermissions" objects. Creates a set of
          default permissions on the database(s) for principals. Used
          by Amazon Web Services Lake Formation. Typically should be
          explicitly set as an empty list.

          * *(dict) --*

            Permissions granted to a principal.

            * **Principal** *(dict) --*

              The principal who is granted permissions.

              * **DataLakePrincipalIdentifier** *(string) --*

                An identifier for the Lake Formation principal.

            * **Permissions** *(list) --*

              The permissions that are granted to the principal.

              * *(string) --*

        * **AllowFullTableExternalDataAccess** *(string) --*

          Allows third-party engines to access data in Amazon S3
          locations that are registered with Lake Formation.

      * **Tags** (*dict*) --

        The tags you assign to the catalog. A map of key-value pairs,
        not more than 50 pairs. Each key is a UTF-8 string, not less
        than 1 or more than 128 bytes long. Each value is a UTF-8
        string, not more than 256 bytes long.

        * *(string) --*

          * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"

   * "Glue.Client.exceptions.ConcurrentModificationException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.FederatedResourceAlreadyExistsException"

   * "Glue.Client.exceptions.FederationSourceException"
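
   The size limits and defaults described in the parameters above can
   be checked on the client side before the call is made. The sketch
   below is illustrative only: the helper names are hypothetical, the
   limits are copied from the parameter descriptions, and the service
   remains the authority on what it accepts.

```python
def validate_tags(tags):
    """Check tags against the limits stated in the Tags description."""
    if len(tags) > 50:
        raise ValueError("not more than 50 tag pairs are allowed")
    for key, value in tags.items():
        key_bytes = len(key.encode("utf-8"))
        if not 1 <= key_bytes <= 128:
            raise ValueError(f"tag key must be 1 to 128 bytes: {key!r}")
        if len(value.encode("utf-8")) > 256:
            raise ValueError(f"tag value exceeds 256 bytes for key {key!r}")


def build_create_catalog_request(name, description=None, tags=None):
    """Build keyword arguments for create_catalog with minimal CatalogInput."""
    if not name:
        raise ValueError("Name is required")
    catalog_input = {
        # The parameter descriptions say these should typically be set
        # explicitly to an empty list.
        "CreateTableDefaultPermissions": [],
        "CreateDatabaseDefaultPermissions": [],
    }
    if description is not None:
        if len(description.encode("utf-8")) > 2048:
            raise ValueError("Description exceeds 2048 bytes")
        catalog_input["Description"] = description
    request = {"Name": name, "CatalogInput": catalog_input}
    if tags:
        validate_tags(tags)
        request["Tags"] = dict(tags)
    return request
```

   The resulting dict could then be passed as
   "client.create_catalog(**request)".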


delete_classifier
*****************

Glue.Client.delete_classifier(**kwargs)

   Removes a classifier from the Data Catalog.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_classifier(
          Name='string'
      )

   Parameters:
      **Name** (*string*) --

      **[REQUIRED]**

      Name of the classifier to remove.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.OperationTimeoutException"
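
   Cleanup scripts often want deletion to succeed even when the
   classifier is already gone. One possible pattern, sketched below,
   catches "EntityNotFoundException" through the client's
   "exceptions" attribute; the helper name
   "delete_classifier_if_exists" is hypothetical, and "client" is
   assumed to be a Glue client created with "boto3.client('glue')".

```python
def delete_classifier_if_exists(client, name):
    """Delete a classifier; return False instead of raising if it is absent."""
    try:
        client.delete_classifier(Name=name)
    except client.exceptions.EntityNotFoundException:
        # The classifier was not found, so there is nothing to delete.
        return False
    return True
```

   Other errors, such as "OperationTimeoutException", are deliberately
   left to propagate to the caller.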


get_usage_profile
*****************

Glue.Client.get_usage_profile(**kwargs)

   Retrieves information about the specified Glue usage profile.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_usage_profile(
          Name='string'
      )

   Parameters:
      **Name** (*string*) --

      **[REQUIRED]**

      The name of the usage profile to retrieve.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Name': 'string',
             'Description': 'string',
             'Configuration': {
                 'SessionConfiguration': {
                     'string': {
                         'DefaultValue': 'string',
                         'AllowedValues': [
                             'string',
                         ],
                         'MinValue': 'string',
                         'MaxValue': 'string'
                     }
                 },
                 'JobConfiguration': {
                     'string': {
                         'DefaultValue': 'string',
                         'AllowedValues': [
                             'string',
                         ],
                         'MinValue': 'string',
                         'MaxValue': 'string'
                     }
                 }
             },
             'CreatedOn': datetime(2015, 1, 1),
             'LastModifiedOn': datetime(2015, 1, 1)
         }

      **Response Structure**

      * *(dict) --*

        * **Name** *(string) --*

          The name of the usage profile.

        * **Description** *(string) --*

          A description of the usage profile.

        * **Configuration** *(dict) --*

          A "ProfileConfiguration" object specifying the job and
          session values for the profile.

          * **SessionConfiguration** *(dict) --*

            A key-value map of configuration parameters for Glue
            sessions.

            * *(string) --*

              * *(dict) --*

                Specifies the values that an admin sets for each job
                or session parameter configured in a Glue usage
                profile.

                * **DefaultValue** *(string) --*

                  A default value for the parameter.

                * **AllowedValues** *(list) --*

                  A list of allowed values for the parameter.

                  * *(string) --*

                * **MinValue** *(string) --*

                  A minimum allowed value for the parameter.

                * **MaxValue** *(string) --*

                  A maximum allowed value for the parameter.

          * **JobConfiguration** *(dict) --*

            A key-value map of configuration parameters for Glue jobs.

            * *(string) --*

              * *(dict) --*

                Specifies the values that an admin sets for each job
                or session parameter configured in a Glue usage
                profile.

                * **DefaultValue** *(string) --*

                  A default value for the parameter.

                * **AllowedValues** *(list) --*

                  A list of allowed values for the parameter.

                  * *(string) --*

                * **MinValue** *(string) --*

                  A minimum allowed value for the parameter.

                * **MaxValue** *(string) --*

                  A maximum allowed value for the parameter.

        * **CreatedOn** *(datetime) --*

          The date and time when the usage profile was created.

        * **LastModifiedOn** *(datetime) --*

          The date and time when the usage profile was last modified.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.OperationNotSupportedException"
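
   The nested "Configuration" map above can be awkward to read
   directly. As an illustration, the sketch below collects the
   "DefaultValue" of every parameter in one section of a
   "get_usage_profile" response. The helper name "default_values" is
   hypothetical, and the code assumes only the response shape
   documented above.

```python
def default_values(response, section="JobConfiguration"):
    """Map each parameter in one configuration section to its DefaultValue.

    `section` is either "JobConfiguration" or "SessionConfiguration".
    Parameters without a DefaultValue are skipped.
    """
    config = response.get("Configuration", {}).get(section, {})
    return {
        key: spec["DefaultValue"]
        for key, spec in config.items()
        if "DefaultValue" in spec
    }

# Example (needs AWS credentials and boto3):
#   import boto3
#   response = boto3.client("glue").get_usage_profile(Name="my-profile")
#   print(default_values(response, "SessionConfiguration"))
```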


cancel_data_quality_ruleset_evaluation_run
******************************************

Glue.Client.cancel_data_quality_ruleset_evaluation_run(**kwargs)

   Cancels a run where a ruleset is being evaluated against a data
   source.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.cancel_data_quality_ruleset_evaluation_run(
          RunId='string'
      )

   Parameters:
      **RunId** (*string*) --

      **[REQUIRED]**

      The unique run identifier associated with this run.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"


create_registry
***************

Glue.Client.create_registry(**kwargs)

   Creates a new registry which may be used to hold a collection of
   schemas.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_registry(
          RegistryName='string',
          Description='string',
          Tags={
              'string': 'string'
          }
      )

   Parameters:
      * **RegistryName** (*string*) --

        **[REQUIRED]**

        The name of the registry to create. The name has a maximum
        length of 255 and may contain only letters, numbers, hyphens,
        underscores, dollar signs, or hash marks. No whitespace is
        allowed.

      * **Description** (*string*) -- A description of the registry.
        If no description is provided, none is set; there is no
        default value.

      * **Tags** (*dict*) --

        Amazon Web Services tags that contain a key value pair and may
        be searched by console, command line, or API.

        * *(string) --*

          * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'RegistryArn': 'string',
             'RegistryName': 'string',
             'Description': 'string',
             'Tags': {
                 'string': 'string'
             }
         }

      **Response Structure**

      * *(dict) --*

        * **RegistryArn** *(string) --*

          The Amazon Resource Name (ARN) of the newly created
          registry.

        * **RegistryName** *(string) --*

          The name of the registry.

        * **Description** *(string) --*

          A description of the registry.

        * **Tags** *(dict) --*

          The tags for the registry.

          * *(string) --*

            * *(string) --*

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.ConcurrentModificationException"

   * "Glue.Client.exceptions.InternalServiceException"
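
   The "RegistryName" constraints above can be checked locally before
   calling the service. The sketch below is one possible reading of
   those constraints: it assumes "letters" means ASCII letters, which
   this page does not state, and the helper names are hypothetical.
   The service remains the final authority and may still raise
   "InvalidInputException".

```python
import re

# Pattern derived from the RegistryName description: 1 to 255 characters
# drawn from letters, digits, hyphen, underscore, dollar sign, and hash mark.
# Assumption: "letters" is read as ASCII letters only.
_REGISTRY_NAME = re.compile(r"[A-Za-z0-9_$#-]{1,255}")


def is_valid_registry_name(name):
    """Return True if `name` satisfies the locally assumed naming rules."""
    return _REGISTRY_NAME.fullmatch(name) is not None


def create_registry_checked(client, name, description=None, tags=None):
    """Validate the name locally, then call create_registry."""
    if not is_valid_registry_name(name):
        raise ValueError(f"invalid registry name: {name!r}")
    kwargs = {"RegistryName": name}
    # Optional parameters are only sent when the caller supplies them.
    if description is not None:
        kwargs["Description"] = description
    if tags:
        kwargs["Tags"] = tags
    return client.create_registry(**kwargs)

# Example (needs AWS credentials and boto3):
#   import boto3
#   create_registry_checked(boto3.client("glue"), "my-registry",
#                           tags={"team": "data"})
```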


start_workflow_run
******************

Glue.Client.start_workflow_run(**kwargs)

   Starts a new run of the specified workflow.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.start_workflow_run(
          Name='string',
          RunProperties={
              'string': 'string'
          }
      )

   Parameters:
      * **Name** (*string*) --

        **[REQUIRED]**

        The name of the workflow to start.

      * **RunProperties** (*dict*) --

        The workflow run properties for the new workflow run.

        Run properties may be logged. Do not pass plaintext secrets as
        properties. Retrieve secrets from a Glue Connection, Amazon
        Web Services Secrets Manager or other secret management
        mechanism if you intend to use them within the workflow run.

        * *(string) --*

          * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'RunId': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **RunId** *(string) --*

          An ID for the new run.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.ConcurrentRunsExceededException"
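
   As an illustration of the parameters and exceptions above, the
   sketch below starts a workflow run and reports a concurrency-limit
   failure as "None" instead of raising. The helper name
   "start_workflow" and the choice to return "None" are assumptions
   made for this example, not boto3 behavior.

```python
def start_workflow(client, name, run_properties=None):
    """Start a workflow run and return its RunId.

    Returns None when the workflow has reached its concurrent run limit.
    """
    kwargs = {"Name": name}
    if run_properties:
        # Run properties may be logged, so pass references (for example
        # the name of a secret), never the secret value itself.
        kwargs["RunProperties"] = run_properties
    try:
        return client.start_workflow_run(**kwargs)["RunId"]
    except client.exceptions.ConcurrentRunsExceededException:
        return None

# Example (needs AWS credentials and boto3):
#   import boto3
#   run_id = start_workflow(boto3.client("glue"), "nightly-etl",
#                           {"secret_name": "prod/db-password"})
```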


get_security_configuration
**************************

Glue.Client.get_security_configuration(**kwargs)

   Retrieves a specified security configuration.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_security_configuration(
          Name='string'
      )

   Parameters:
      **Name** (*string*) --

      **[REQUIRED]**

      The name of the security configuration to retrieve.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'SecurityConfiguration': {
                 'Name': 'string',
                 'CreatedTimeStamp': datetime(2015, 1, 1),
                 'EncryptionConfiguration': {
                     'S3Encryption': [
                         {
                             'S3EncryptionMode': 'DISABLED'|'SSE-KMS'|'SSE-S3',
                             'KmsKeyArn': 'string'
                         },
                     ],
                     'CloudWatchEncryption': {
                         'CloudWatchEncryptionMode': 'DISABLED'|'SSE-KMS',
                         'KmsKeyArn': 'string'
                     },
                     'JobBookmarksEncryption': {
                         'JobBookmarksEncryptionMode': 'DISABLED'|'CSE-KMS',
                         'KmsKeyArn': 'string'
                     },
                     'DataQualityEncryption': {
                         'DataQualityEncryptionMode': 'DISABLED'|'SSE-KMS',
                         'KmsKeyArn': 'string'
                     }
                 }
             }
         }

      **Response Structure**

      * *(dict) --*

        * **SecurityConfiguration** *(dict) --*

          The requested security configuration.

          * **Name** *(string) --*

            The name of the security configuration.

          * **CreatedTimeStamp** *(datetime) --*

            The time at which this security configuration was created.

          * **EncryptionConfiguration** *(dict) --*

            The encryption configuration associated with this security
            configuration.

            * **S3Encryption** *(list) --*

              The encryption configuration for Amazon Simple Storage
              Service (Amazon S3) data.

              * *(dict) --*

                Specifies how Amazon Simple Storage Service (Amazon
                S3) data should be encrypted.

                * **S3EncryptionMode** *(string) --*

                  The encryption mode to use for Amazon S3 data.

                * **KmsKeyArn** *(string) --*

                  The Amazon Resource Name (ARN) of the KMS key to be
                  used to encrypt the data.

            * **CloudWatchEncryption** *(dict) --*

              The encryption configuration for Amazon CloudWatch.

              * **CloudWatchEncryptionMode** *(string) --*

                The encryption mode to use for CloudWatch data.

              * **KmsKeyArn** *(string) --*

                The Amazon Resource Name (ARN) of the KMS key to be
                used to encrypt the data.

            * **JobBookmarksEncryption** *(dict) --*

              The encryption configuration for job bookmarks.

              * **JobBookmarksEncryptionMode** *(string) --*

                The encryption mode to use for job bookmarks data.

              * **KmsKeyArn** *(string) --*

                The Amazon Resource Name (ARN) of the KMS key to be
                used to encrypt the data.

            * **DataQualityEncryption** *(dict) --*

              The encryption configuration for Glue Data Quality
              assets.

              * **DataQualityEncryptionMode** *(string) --*

                The encryption mode to use for encrypting Data Quality
                assets. These assets include data quality rulesets,
                results, statistics, anomaly detection models and
                observations.

                Valid values are "SSE-KMS" for encryption using a
                customer-managed KMS key, or "DISABLED".

              * **KmsKeyArn** *(string) --*

                The Amazon Resource Name (ARN) of the KMS key to be
                used to encrypt the data.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
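
   To show how the response structure above might be consumed, the
   sketch below reduces a "get_security_configuration" response to the
   encryption mode of each component. The helper name
   "encryption_modes" and the output keys are hypothetical; missing
   sections are reported as "None" rather than assumed to be
   "DISABLED".

```python
def encryption_modes(response):
    """Summarize the encryption mode of each component in the response."""
    enc = response["SecurityConfiguration"].get("EncryptionConfiguration", {})
    return {
        # S3Encryption is a list, so one mode is collected per entry.
        "S3": [item.get("S3EncryptionMode") for item in enc.get("S3Encryption", [])],
        "CloudWatch": enc.get("CloudWatchEncryption", {}).get(
            "CloudWatchEncryptionMode"
        ),
        "JobBookmarks": enc.get("JobBookmarksEncryption", {}).get(
            "JobBookmarksEncryptionMode"
        ),
        "DataQuality": enc.get("DataQualityEncryption", {}).get(
            "DataQualityEncryptionMode"
        ),
    }

# Example (needs AWS credentials and boto3):
#   import boto3
#   response = boto3.client("glue").get_security_configuration(Name="my-config")
#   print(encryption_modes(response))
```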


get_job
*******

Glue.Client.get_job(**kwargs)

   Retrieves an existing job definition.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_job(
          JobName='string'
      )

   Parameters:
      **JobName** (*string*) --

      **[REQUIRED]**

      The name of the job definition to retrieve.

   Return type:
      dict

   Returns:
      **Response Syntax**

         # This section is too large to render.
         # Please see the AWS API Documentation linked below.

      AWS API Documentation

      **Response Structure**

         # This section is too large to render.
         # Please see the AWS API Documentation linked below.

      AWS API Documentation

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
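
   Because the response structure is not rendered on this page, the
   sketch below relies on one detail taken from the AWS API reference
   rather than from the text above: the job definition is returned
   under a top-level "Job" key. Treat that as an assumption to verify.
   The helper name "get_job_or_none" is hypothetical.

```python
def get_job_or_none(client, job_name):
    """Return the job definition dict, or None if the job does not exist."""
    try:
        response = client.get_job(JobName=job_name)
    except client.exceptions.EntityNotFoundException:
        return None
    # Assumption: the definition sits under "Job" (per the AWS API
    # reference; this page does not render the response structure).
    return response.get("Job")

# Example (needs AWS credentials and boto3):
#   import boto3
#   job = get_job_or_none(boto3.client("glue"), "nightly-etl")
```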


import_catalog_to_glue
**********************

Glue.Client.import_catalog_to_glue(**kwargs)

   Imports an existing Amazon Athena Data Catalog to Glue.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.import_catalog_to_glue(
          CatalogId='string'
      )

   Parameters:
      **CatalogId** (*string*) -- The ID of the catalog to import.
      Currently, this should be the Amazon Web Services account ID.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"


delete_data_quality_ruleset
***************************

Glue.Client.delete_data_quality_ruleset(**kwargs)

   Deletes a data quality ruleset.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_data_quality_ruleset(
          Name='string'
      )

   Parameters:
      **Name** (*string*) --

      **[REQUIRED]**

      The name of the data quality ruleset to delete.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"


get_unfiltered_partition_metadata
*********************************

Glue.Client.get_unfiltered_partition_metadata(**kwargs)

   Retrieves partition metadata from the Data Catalog that contains
   unfiltered metadata.

   For IAM authorization, the public IAM action associated with this
   API is "glue:GetPartition".

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_unfiltered_partition_metadata(
          Region='string',
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          PartitionValues=[
              'string',
          ],
          AuditContext={
              'AdditionalAuditContext': 'string',
              'RequestedColumns': [
                  'string',
              ],
              'AllColumnsRequested': True|False
          },
          SupportedPermissionTypes=[
              'COLUMN_PERMISSION'|'CELL_FILTER_PERMISSION'|'NESTED_PERMISSION'|'NESTED_CELL_PERMISSION',
          ],
          QuerySessionContext={
              'QueryId': 'string',
              'QueryStartTime': datetime(2015, 1, 1),
              'ClusterId': 'string',
              'QueryAuthorizationId': 'string',
              'AdditionalContext': {
                  'string': 'string'
              }
          }
      )

   Parameters:
      * **Region** (*string*) -- Specified only if the base tables
        belong to a different Amazon Web Services Region.

      * **CatalogId** (*string*) --

        **[REQUIRED]**

        The catalog ID where the partition resides.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        (Required) Specifies the name of a database that contains the
        partition.

      * **TableName** (*string*) --

        **[REQUIRED]**

        (Required) Specifies the name of a table that contains the
        partition.

      * **PartitionValues** (*list*) --

        **[REQUIRED]**

        (Required) A list of partition key values.

        * *(string) --*

      * **AuditContext** (*dict*) --

        A structure containing Lake Formation audit context
        information.

        * **AdditionalAuditContext** *(string) --*

          A string containing the additional audit context
          information.

        * **RequestedColumns** *(list) --*

          The requested columns for audit.

          * *(string) --*

        * **AllColumnsRequested** *(boolean) --*

          Indicates whether all columns are requested for audit.

      * **SupportedPermissionTypes** (*list*) --

        **[REQUIRED]**

        (Required) A list of supported permission types.

        * *(string) --*

      * **QuerySessionContext** (*dict*) --

        A structure used as a protocol between query engines and Lake
        Formation or Glue. Contains both a Lake Formation generated
        authorization identifier and information from the request's
        authorization context.

        * **QueryId** *(string) --*

          A unique identifier generated by the query engine for the
          query.

        * **QueryStartTime** *(datetime) --*

          A timestamp provided by the query engine for when the query
          started.

        * **ClusterId** *(string) --*

          An identifier string for the consumer cluster.

        * **QueryAuthorizationId** *(string) --*

          A cryptographically generated query identifier generated by
          Glue or Lake Formation.

        * **AdditionalContext** *(dict) --*

          An opaque string-string map passed by the query engine.

          * *(string) --*

            * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Partition': {
                 'Values': [
                     'string',
                 ],
                 'DatabaseName': 'string',
                 'TableName': 'string',
                 'CreationTime': datetime(2015, 1, 1),
                 'LastAccessTime': datetime(2015, 1, 1),
                 'StorageDescriptor': {
                     'Columns': [
                         {
                             'Name': 'string',
                             'Type': 'string',
                             'Comment': 'string',
                             'Parameters': {
                                 'string': 'string'
                             }
                         },
                     ],
                     'Location': 'string',
                     'AdditionalLocations': [
                         'string',
                     ],
                     'InputFormat': 'string',
                     'OutputFormat': 'string',
                     'Compressed': True|False,
                     'NumberOfBuckets': 123,
                     'SerdeInfo': {
                         'Name': 'string',
                         'SerializationLibrary': 'string',
                         'Parameters': {
                             'string': 'string'
                         }
                     },
                     'BucketColumns': [
                         'string',
                     ],
                     'SortColumns': [
                         {
                             'Column': 'string',
                             'SortOrder': 123
                         },
                     ],
                     'Parameters': {
                         'string': 'string'
                     },
                     'SkewedInfo': {
                         'SkewedColumnNames': [
                             'string',
                         ],
                         'SkewedColumnValues': [
                             'string',
                         ],
                         'SkewedColumnValueLocationMaps': {
                             'string': 'string'
                         }
                     },
                     'StoredAsSubDirectories': True|False,
                     'SchemaReference': {
                         'SchemaId': {
                             'SchemaArn': 'string',
                             'SchemaName': 'string',
                             'RegistryName': 'string'
                         },
                         'SchemaVersionId': 'string',
                         'SchemaVersionNumber': 123
                     }
                 },
                 'Parameters': {
                     'string': 'string'
                 },
                 'LastAnalyzedTime': datetime(2015, 1, 1),
                 'CatalogId': 'string'
             },
             'AuthorizedColumns': [
                 'string',
             ],
             'IsRegisteredWithLakeFormation': True|False
         }

      **Response Structure**

      * *(dict) --*

        * **Partition** *(dict) --*

          A Partition object containing the partition metadata.

          * **Values** *(list) --*

            The values of the partition.

            * *(string) --*

          * **DatabaseName** *(string) --*

            The name of the catalog database that contains the
            partition.

          * **TableName** *(string) --*

            The name of the database table that contains the
            partition.

          * **CreationTime** *(datetime) --*

            The time at which the partition was created.

          * **LastAccessTime** *(datetime) --*

            The last time at which the partition was accessed.

          * **StorageDescriptor** *(dict) --*

            Provides information about the physical location where the
            partition is stored.

            * **Columns** *(list) --*

              A list of the "Columns" in the table.

              * *(dict) --*

                A column in a "Table".

                * **Name** *(string) --*

                  The name of the "Column".

                * **Type** *(string) --*

                  The data type of the "Column".

                * **Comment** *(string) --*

                  A free-form text comment.

                * **Parameters** *(dict) --*

                  These key-value pairs define properties associated
                  with the column.

                  * *(string) --*

                    * *(string) --*

            * **Location** *(string) --*

              The physical location of the table. By default, this
              takes the form of the warehouse location, followed by
              the database location in the warehouse, followed by the
              table name.

            * **AdditionalLocations** *(list) --*

              A list of locations that point to the path where a Delta
              table is located.

              * *(string) --*

            * **InputFormat** *(string) --*

              The input format: "SequenceFileInputFormat" (binary), or
              "TextInputFormat", or a custom format.

            * **OutputFormat** *(string) --*

              The output format: "SequenceFileOutputFormat" (binary),
              or "IgnoreKeyTextOutputFormat", or a custom format.

            * **Compressed** *(boolean) --*

              "True" if the data in the table is compressed, or
              "False" if not.

            * **NumberOfBuckets** *(integer) --*

              Must be specified if the table contains any dimension
              columns.

            * **SerdeInfo** *(dict) --*

              The serialization/deserialization (SerDe) information.

              * **Name** *(string) --*

                Name of the SerDe.

              * **SerializationLibrary** *(string) --*

                Usually the class that implements the SerDe. An
                example is
                "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe".

              * **Parameters** *(dict) --*

                These key-value pairs define initialization parameters
                for the SerDe.

                * *(string) --*

                  * *(string) --*

            * **BucketColumns** *(list) --*

              A list of reducer grouping columns, clustering columns,
              and bucketing columns in the table.

              * *(string) --*

            * **SortColumns** *(list) --*

              A list specifying the sort order of each bucket in the
              table.

              * *(dict) --*

                Specifies the sort order of a sorted column.

                * **Column** *(string) --*

                  The name of the column.

                * **SortOrder** *(integer) --*

                  Indicates that the column is sorted in ascending
                  order ("== 1"), or in descending order ("== 0").

            * **Parameters** *(dict) --*

              The user-supplied properties in key-value form.

              * *(string) --*

                * *(string) --*

            * **SkewedInfo** *(dict) --*

              The information about values that appear frequently in a
              column (skewed values).

              * **SkewedColumnNames** *(list) --*

                A list of names of columns that contain skewed values.

                * *(string) --*

              * **SkewedColumnValues** *(list) --*

                A list of values that appear so frequently as to be
                considered skewed.

                * *(string) --*

              * **SkewedColumnValueLocationMaps** *(dict) --*

                A mapping of skewed values to the columns that contain
                them.

                * *(string) --*

                  * *(string) --*

            * **StoredAsSubDirectories** *(boolean) --*

              "True" if the table data is stored in subdirectories, or
              "False" if not.

            * **SchemaReference** *(dict) --*

              An object that references a schema stored in the Glue
              Schema Registry.

              When creating a table, you can pass an empty list of
              columns for the schema, and instead use a schema
              reference.

              * **SchemaId** *(dict) --*

                A structure that contains schema identity fields.
                Either this or the "SchemaVersionId" has to be
                provided.

                * **SchemaArn** *(string) --*

                  The Amazon Resource Name (ARN) of the schema. One of
                  "SchemaArn" or "SchemaName" has to be provided.

                * **SchemaName** *(string) --*

                  The name of the schema. One of "SchemaArn" or
                  "SchemaName" has to be provided.

                * **RegistryName** *(string) --*

                  The name of the schema registry that contains the
                  schema.

              * **SchemaVersionId** *(string) --*

                The unique ID assigned to a version of the schema.
                Either this or the "SchemaId" has to be provided.

              * **SchemaVersionNumber** *(integer) --*

                The version number of the schema.

          * **Parameters** *(dict) --*

            These key-value pairs define partition parameters.

            * *(string) --*

              * *(string) --*

          * **LastAnalyzedTime** *(datetime) --*

            The last time at which column statistics were computed for
            this partition.

          * **CatalogId** *(string) --*

            The ID of the Data Catalog in which the partition resides.

        * **AuthorizedColumns** *(list) --*

          A list of column names that the user has been granted access
          to.

          * *(string) --*

        * **IsRegisteredWithLakeFormation** *(boolean) --*

          A Boolean value that indicates whether the partition
          location is registered with Lake Formation.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"

   * "Glue.Client.exceptions.PermissionTypeMismatchException"

   * "Glue.Client.exceptions.FederationSourceException"

   * "Glue.Client.exceptions.FederationSourceRetryableException"
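
   The request above has several required fields, and the response
   pairs partition columns with a separate "AuthorizedColumns" list.
   The sketch below shows one way to assemble the required arguments
   and to keep only the columns named in "AuthorizedColumns". Both
   helper names are hypothetical, and filtering the storage descriptor
   columns this way is an illustrative reading of the response, not
   documented service behavior.

```python
def build_partition_request(catalog_id, database, table, partition_values,
                            permission_types=("COLUMN_PERMISSION",),
                            region=None):
    """Assemble keyword arguments for get_unfiltered_partition_metadata."""
    request = {
        "CatalogId": catalog_id,
        "DatabaseName": database,
        "TableName": table,
        "PartitionValues": list(partition_values),
        "SupportedPermissionTypes": list(permission_types),
    }
    # Region is only needed when the base tables are in another Region.
    if region is not None:
        request["Region"] = region
    return request


def authorized_columns(response):
    """Return the partition's column dicts that appear in AuthorizedColumns."""
    allowed = set(response.get("AuthorizedColumns", []))
    columns = response["Partition"]["StorageDescriptor"]["Columns"]
    return [col for col in columns if col["Name"] in allowed]

# Example (needs AWS credentials and boto3):
#   import boto3
#   client = boto3.client("glue")
#   request = build_partition_request("123456789012", "sales", "orders",
#                                     ["2024", "01"])
#   response = client.get_unfiltered_partition_metadata(**request)
#   print(authorized_columns(response))
```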


delete_catalog
**************

Glue.Client.delete_catalog(**kwargs)

   Removes the specified catalog from the Glue Data Catalog.

   After completing this operation, you no longer have access to the
   databases, tables (and all table versions and partitions that might
   belong to the tables) and the user-defined functions in the deleted
   catalog. Glue deletes these "orphaned" resources asynchronously in
   a timely manner, at the discretion of the service.

   To ensure the immediate deletion of all related resources before
   calling the "DeleteCatalog" operation, use "DeleteTableVersion" (or
   "BatchDeleteTableVersion"), "DeletePartition" (or
   "BatchDeletePartition"), "DeleteTable" (or "BatchDeleteTable"),
   "DeleteUserDefinedFunction" and "DeleteDatabase" to delete any
   resources that belong to the catalog.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_catalog(
          CatalogId='string'
      )

   Parameters:
      **CatalogId** (*string*) --

      **[REQUIRED]**

      The ID of the catalog.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"

   * "Glue.Client.exceptions.ConcurrentModificationException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.FederationSourceException"
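
   The cleanup order described above can be sketched as follows. This
   is a partial illustration only: it removes tables and databases
   before the catalog, and it omits the table version, partition, and
   user-defined function steps. The helper name
   "delete_catalog_with_contents" is hypothetical, and the batch size
   of 100 is the per-call limit as understood from the AWS API
   reference, not from this page.

```python
def delete_catalog_with_contents(client, catalog_id, batch_size=100):
    """Delete every table and database in a catalog, then the catalog."""
    db_pages = client.get_paginator("get_databases").paginate(CatalogId=catalog_id)
    db_names = [db["Name"] for page in db_pages for db in page["DatabaseList"]]

    for db_name in db_names:
        table_pages = client.get_paginator("get_tables").paginate(
            CatalogId=catalog_id, DatabaseName=db_name
        )
        tables = [t["Name"] for page in table_pages for t in page["TableList"]]

        # Delete tables in chunks; batch_size is an assumed per-call limit.
        for start in range(0, len(tables), batch_size):
            result = client.batch_delete_table(
                CatalogId=catalog_id,
                DatabaseName=db_name,
                TablesToDelete=tables[start:start + batch_size],
            )
            errors = result.get("Errors", [])
            if errors:
                # Stop rather than delete the database with tables left over.
                raise RuntimeError(f"could not delete tables in {db_name}: {errors}")

        client.delete_database(CatalogId=catalog_id, Name=db_name)

    client.delete_catalog(CatalogId=catalog_id)

# Example (needs AWS credentials and boto3; this is destructive):
#   import boto3
#   delete_catalog_with_contents(boto3.client("glue"), "my-catalog-id")
```

   Listing all names before deleting avoids paginating over a
   collection while it is being modified.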


update_table
************

Glue.Client.update_table(**kwargs)

   Updates a metadata table in the Data Catalog.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.update_table(
          CatalogId='string',
          DatabaseName='string',
          Name='string',
          TableInput={
              'Name': 'string',
              'Description': 'string',
              'Owner': 'string',
              'LastAccessTime': datetime(2015, 1, 1),
              'LastAnalyzedTime': datetime(2015, 1, 1),
              'Retention': 123,
              'StorageDescriptor': {
                  'Columns': [
                      {
                          'Name': 'string',
                          'Type': 'string',
                          'Comment': 'string',
                          'Parameters': {
                              'string': 'string'
                          }
                      },
                  ],
                  'Location': 'string',
                  'AdditionalLocations': [
                      'string',
                  ],
                  'InputFormat': 'string',
                  'OutputFormat': 'string',
                  'Compressed': True|False,
                  'NumberOfBuckets': 123,
                  'SerdeInfo': {
                      'Name': 'string',
                      'SerializationLibrary': 'string',
                      'Parameters': {
                          'string': 'string'
                      }
                  },
                  'BucketColumns': [
                      'string',
                  ],
                  'SortColumns': [
                      {
                          'Column': 'string',
                          'SortOrder': 123
                      },
                  ],
                  'Parameters': {
                      'string': 'string'
                  },
                  'SkewedInfo': {
                      'SkewedColumnNames': [
                          'string',
                      ],
                      'SkewedColumnValues': [
                          'string',
                      ],
                      'SkewedColumnValueLocationMaps': {
                          'string': 'string'
                      }
                  },
                  'StoredAsSubDirectories': True|False,
                  'SchemaReference': {
                      'SchemaId': {
                          'SchemaArn': 'string',
                          'SchemaName': 'string',
                          'RegistryName': 'string'
                      },
                      'SchemaVersionId': 'string',
                      'SchemaVersionNumber': 123
                  }
              },
              'PartitionKeys': [
                  {
                      'Name': 'string',
                      'Type': 'string',
                      'Comment': 'string',
                      'Parameters': {
                          'string': 'string'
                      }
                  },
              ],
              'ViewOriginalText': 'string',
              'ViewExpandedText': 'string',
              'TableType': 'string',
              'Parameters': {
                  'string': 'string'
              },
              'TargetTable': {
                  'CatalogId': 'string',
                  'DatabaseName': 'string',
                  'Name': 'string',
                  'Region': 'string'
              },
              'ViewDefinition': {
                  'IsProtected': True|False,
                  'Definer': 'string',
                  'Representations': [
                      {
                          'Dialect': 'REDSHIFT'|'ATHENA'|'SPARK',
                          'DialectVersion': 'string',
                          'ViewOriginalText': 'string',
                          'ValidationConnection': 'string',
                          'ViewExpandedText': 'string'
                      },
                  ],
                  'SubObjects': [
                      'string',
                  ]
              }
          },
          SkipArchive=True|False,
          TransactionId='string',
          VersionId='string',
          ViewUpdateAction='ADD'|'REPLACE'|'ADD_OR_REPLACE'|'DROP',
          Force=True|False,
          UpdateOpenTableFormatInput={
              'UpdateIcebergInput': {
                  'UpdateIcebergTableInput': {
                      'Updates': [
                          {
                              'Schema': {
                                  'SchemaId': 123,
                                  'IdentifierFieldIds': [
                                      123,
                                  ],
                                  'Type': 'struct',
                                  'Fields': [
                                      {
                                          'Id': 123,
                                          'Name': 'string',
                                          'Type': {...}|[...]|123|123.4|'string'|True|None,
                                          'Required': True|False,
                                          'Doc': 'string'
                                      },
                                  ]
                              },
                              'PartitionSpec': {
                                  'Fields': [
                                      {
                                          'SourceId': 123,
                                          'Transform': 'string',
                                          'Name': 'string',
                                          'FieldId': 123
                                      },
                                  ],
                                  'SpecId': 123
                              },
                              'SortOrder': {
                                  'OrderId': 123,
                                  'Fields': [
                                      {
                                          'SourceId': 123,
                                          'Transform': 'string',
                                          'Direction': 'asc'|'desc',
                                          'NullOrder': 'nulls-first'|'nulls-last'
                                      },
                                  ]
                              },
                              'Location': 'string',
                              'Properties': {
                                  'string': 'string'
                              }
                          },
                      ]
                  }
              }
          }
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog where
        the table resides. If none is provided, the Amazon Web
        Services account ID is used by default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the catalog database in which the table resides.
        For Hive compatibility, this name is entirely lowercase.

      * **Name** (*string*) -- The unique identifier, within the
        specified database, of the table to update in the Glue Data
        Catalog.

      * **TableInput** (*dict*) --

        An updated "TableInput" object to define the metadata table in
        the catalog.

        * **Name** *(string) --* **[REQUIRED]**

          The table name. For Hive compatibility, this is folded to
          lowercase when it is stored.

        * **Description** *(string) --*

          A description of the table.

        * **Owner** *(string) --*

          The table owner. Included for Apache Hive compatibility. Not
          used in the normal course of Glue operations.

        * **LastAccessTime** *(datetime) --*

          The last time that the table was accessed.

        * **LastAnalyzedTime** *(datetime) --*

          The last time that column statistics were computed for this
          table.

        * **Retention** *(integer) --*

          The retention time for this table.

        * **StorageDescriptor** *(dict) --*

          A storage descriptor containing information about the
          physical storage of this table.

          * **Columns** *(list) --*

            A list of the "Columns" in the table.

            * *(dict) --*

              A column in a "Table".

              * **Name** *(string) --* **[REQUIRED]**

                The name of the "Column".

              * **Type** *(string) --*

                The data type of the "Column".

              * **Comment** *(string) --*

                A free-form text comment.

              * **Parameters** *(dict) --*

                These key-value pairs define properties associated
                with the column.

                * *(string) --*

                  * *(string) --*

          * **Location** *(string) --*

            The physical location of the table. By default, this takes
            the form of the warehouse location, followed by the
            database location in the warehouse, followed by the table
            name.

          * **AdditionalLocations** *(list) --*

            A list of locations that point to the path where a Delta
            table is located.

            * *(string) --*

          * **InputFormat** *(string) --*

            The input format: "SequenceFileInputFormat" (binary), or
            "TextInputFormat", or a custom format.

          * **OutputFormat** *(string) --*

            The output format: "SequenceFileOutputFormat" (binary), or
            "IgnoreKeyTextOutputFormat", or a custom format.

          * **Compressed** *(boolean) --*

            "True" if the data in the table is compressed, or "False"
            if not.

          * **NumberOfBuckets** *(integer) --*

            The number of buckets. Must be specified if the table
            contains any dimension columns.

          * **SerdeInfo** *(dict) --*

            The serialization/deserialization (SerDe) information.

            * **Name** *(string) --*

              Name of the SerDe.

            * **SerializationLibrary** *(string) --*

              Usually the class that implements the SerDe. An example
              is
              "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe".

            * **Parameters** *(dict) --*

              These key-value pairs define initialization parameters
              for the SerDe.

              * *(string) --*

                * *(string) --*

          * **BucketColumns** *(list) --*

            A list of reducer grouping columns, clustering columns,
            and bucketing columns in the table.

            * *(string) --*

          * **SortColumns** *(list) --*

            A list specifying the sort order of each bucket in the
            table.

            * *(dict) --*

              Specifies the sort order of a sorted column.

              * **Column** *(string) --* **[REQUIRED]**

                The name of the column.

              * **SortOrder** *(integer) --* **[REQUIRED]**

                Indicates that the column is sorted in ascending order
                ("== 1"), or in descending order ("== 0").

          * **Parameters** *(dict) --*

            The user-supplied properties in key-value form.

            * *(string) --*

              * *(string) --*

          * **SkewedInfo** *(dict) --*

            The information about values that appear frequently in a
            column (skewed values).

            * **SkewedColumnNames** *(list) --*

              A list of names of columns that contain skewed values.

              * *(string) --*

            * **SkewedColumnValues** *(list) --*

              A list of values that appear so frequently as to be
              considered skewed.

              * *(string) --*

            * **SkewedColumnValueLocationMaps** *(dict) --*

              A mapping of skewed values to the columns that contain
              them.

              * *(string) --*

                * *(string) --*

          * **StoredAsSubDirectories** *(boolean) --*

            "True" if the table data is stored in subdirectories, or
            "False" if not.

          * **SchemaReference** *(dict) --*

            An object that references a schema stored in the Glue
            Schema Registry.

            When creating a table, you can pass an empty list of
            columns for the schema, and instead use a schema
            reference.

            * **SchemaId** *(dict) --*

              A structure that contains schema identity fields. Either
              this or the "SchemaVersionId" has to be provided.

              * **SchemaArn** *(string) --*

                The Amazon Resource Name (ARN) of the schema. One of
                "SchemaArn" or "SchemaName" has to be provided.

              * **SchemaName** *(string) --*

                The name of the schema. One of "SchemaArn" or
                "SchemaName" has to be provided.

              * **RegistryName** *(string) --*

                The name of the schema registry that contains the
                schema.

            * **SchemaVersionId** *(string) --*

              The unique ID assigned to a version of the schema.
              Either this or the "SchemaId" has to be provided.

            * **SchemaVersionNumber** *(integer) --*

              The version number of the schema.

        * **PartitionKeys** *(list) --*

          A list of columns by which the table is partitioned. Only
          primitive types are supported as partition keys.

          When you create a table used by Amazon Athena, and you do
          not specify any "PartitionKeys", you must at least set the
          value of "PartitionKeys" to an empty list. For example:

          ""PartitionKeys": []"

          * *(dict) --*

            A column in a "Table".

            * **Name** *(string) --* **[REQUIRED]**

              The name of the "Column".

            * **Type** *(string) --*

              The data type of the "Column".

            * **Comment** *(string) --*

              A free-form text comment.

            * **Parameters** *(dict) --*

              These key-value pairs define properties associated with
              the column.

              * *(string) --*

                * *(string) --*

        * **ViewOriginalText** *(string) --*

          Included for Apache Hive compatibility. Not used in the
          normal course of Glue operations. If the table is a
          "VIRTUAL_VIEW", this contains certain Athena configuration
          encoded in base64.

        * **ViewExpandedText** *(string) --*

          Included for Apache Hive compatibility. Not used in the
          normal course of Glue operations.

        * **TableType** *(string) --*

          The type of this table. Glue will create tables with the
          "EXTERNAL_TABLE" type. Other services, such as Athena, may
          create tables with additional table types.

          Glue related table types:

             EXTERNAL_TABLE

          Hive compatible attribute - indicates a non-Hive managed
          table.

             GOVERNED

          Used by Lake Formation. The Glue Data Catalog understands
          "GOVERNED".

        * **Parameters** *(dict) --*

          These key-value pairs define properties associated with the
          table.

          * *(string) --*

            * *(string) --*

        * **TargetTable** *(dict) --*

          A "TableIdentifier" structure that describes a target table
          for resource linking.

          * **CatalogId** *(string) --*

            The ID of the Data Catalog in which the table resides.

          * **DatabaseName** *(string) --*

            The name of the catalog database that contains the target
            table.

          * **Name** *(string) --*

            The name of the target table.

          * **Region** *(string) --*

            Region of the target table.

        * **ViewDefinition** *(dict) --*

          A structure that contains all the information that defines
          the view, including the dialect or dialects for the view,
          and the query.

          * **IsProtected** *(boolean) --*

            You can set this flag as true to instruct the engine not
            to push user-provided operations into the logical plan of
            the view during query planning. However, setting this flag
            does not guarantee that the engine will comply. Refer to
            the engine's documentation to understand the guarantees
            provided, if any.

          * **Definer** *(string) --*

            The definer of a view in SQL.

          * **Representations** *(list) --*

            A list of structures that contains the dialect of the
            view, and the query that defines the view.

            * *(dict) --*

              A structure containing details of a representation to
              update or create a Lake Formation view.

              * **Dialect** *(string) --*

                A parameter that specifies the engine type of a
                specific representation.

              * **DialectVersion** *(string) --*

                A parameter that specifies the version of the engine
                of a specific representation.

              * **ViewOriginalText** *(string) --*

                A string that represents the original SQL query that
                describes the view.

              * **ValidationConnection** *(string) --*

                The name of the connection to be used to validate the
                specific representation of the view.

              * **ViewExpandedText** *(string) --*

                A string that represents the SQL query that describes
                the view with expanded resource ARNs.

          * **SubObjects** *(list) --*

            A list of base table ARNs that make up the view.

            * *(string) --*

      * **SkipArchive** (*boolean*) -- By default, "UpdateTable"
        always creates an archived version of the table before
        updating it. However, if "SkipArchive" is set to true,
        "UpdateTable" does not create the archived version.

      * **TransactionId** (*string*) -- The transaction ID at which to
        update the table contents.

      * **VersionId** (*string*) -- The version ID at which to update
        the table contents.

      * **ViewUpdateAction** (*string*) -- The operation to be
        performed when updating the view.

      * **Force** (*boolean*) -- A flag that can be set to true to
        ignore the storage descriptor and subobject matching
        requirements.

      * **UpdateOpenTableFormatInput** (*dict*) --

        Input parameters for updating open table format tables in the
        Glue Data Catalog, serving as a wrapper for format-specific
        update operations such as Apache Iceberg.

        * **UpdateIcebergInput** *(dict) --*

          Apache Iceberg-specific update parameters that define the
          table modifications to be applied, including schema changes,
          partition specifications, and table properties.

          * **UpdateIcebergTableInput** *(dict) --* **[REQUIRED]**

            The specific update operations to be applied to the
            Iceberg table, containing a list of updates that define
            the new state of the table including schema, partitions,
            and properties.

            * **Updates** *(list) --* **[REQUIRED]**

              The list of table update operations that specify the
              changes to be made to the Iceberg table, including
              schema modifications, partition specifications, and
              table properties.

              * *(dict) --*

                Defines a complete set of updates to be applied to an
                Iceberg table, including schema changes, partitioning
                modifications, sort order adjustments, location
                updates, and property changes.

                * **Schema** *(dict) --* **[REQUIRED]**

                  The updated schema definition for the Iceberg table,
                  specifying any changes to field structure, data
                  types, or schema metadata.

                  * **SchemaId** *(integer) --*

                    The unique identifier for this schema version
                    within the Iceberg table's schema evolution
                    history.

                  * **IdentifierFieldIds** *(list) --*

                    The list of field identifiers that uniquely
                    identify records in the table, used for row-level
                    operations and deduplication.

                    * *(integer) --*

                  * **Type** *(string) --*

                    The root type of the schema structure, typically
                    "struct" for Iceberg table schemas.

                  * **Fields** *(list) --* **[REQUIRED]**

                    The list of field definitions that make up the
                    table schema, including field names, types, and
                    metadata.

                    * *(dict) --*

                      Defines a single field within an Iceberg table
                      schema, including its identifier, name, data
                      type, nullability, and documentation.

                      * **Id** *(integer) --* **[REQUIRED]**

                        The unique identifier assigned to this field
                        within the Iceberg table schema, used for
                        schema evolution and field tracking.

                      * **Name** *(string) --* **[REQUIRED]**

                        The name of the field as it appears in the
                        table schema and query operations.

                      * **Type** (*document*) -- **[REQUIRED]**

                        The data type definition for this field,
                        specifying the structure and format of the
                        data it contains.

                      * **Required** *(boolean) --* **[REQUIRED]**

                        Indicates whether this field is required (non-
                        nullable) or optional (nullable) in the table
                        schema.

                      * **Doc** *(string) --*

                        Optional documentation or description text
                        that provides additional context about the
                        purpose and usage of this field.

                * **PartitionSpec** *(dict) --*

                  The updated partitioning specification that defines
                  how the table data should be reorganized and
                  partitioned.

                  * **Fields** *(list) --* **[REQUIRED]**

                    The list of partition fields that define how the
                    table data should be partitioned, including source
                    fields and their transformations.

                    * *(dict) --*

                      Defines a single partition field within an
                      Iceberg partition specification, including the
                      source field, transformation function, partition
                      name, and unique identifier.

                      * **SourceId** *(integer) --* **[REQUIRED]**

                        The identifier of the source field from the
                        table schema that this partition field is
                        based on.

                      * **Transform** *(string) --* **[REQUIRED]**

                        The transformation function applied to the
                        source field to create the partition, such as
                        identity, bucket, truncate, year, month, day,
                        or hour.

                      * **Name** *(string) --* **[REQUIRED]**

                        The name of the partition field as it will
                        appear in the partitioned table structure.

                      * **FieldId** *(integer) --*

                        The unique identifier assigned to this
                        partition field within the Iceberg table's
                        partition specification.

                  * **SpecId** *(integer) --*

                    The unique identifier for this partition
                    specification within the Iceberg table's metadata
                    history.

                * **SortOrder** *(dict) --*

                  The updated sort order specification that defines
                  how data should be ordered within partitions for
                  optimal query performance.

                  * **OrderId** *(integer) --* **[REQUIRED]**

                    The unique identifier for this sort order
                    specification within the Iceberg table's metadata.

                  * **Fields** *(list) --* **[REQUIRED]**

                    The list of fields and their sort directions that
                    define the ordering criteria for the Iceberg table
                    data.

                    * *(dict) --*

                      Defines a single field within an Iceberg sort
                      order specification, including the source field,
                      transformation, sort direction, and null value
                      ordering.

                      * **SourceId** *(integer) --* **[REQUIRED]**

                        The identifier of the source field from the
                        table schema that this sort field is based on.

                      * **Transform** *(string) --* **[REQUIRED]**

                        The transformation function applied to the
                        source field before sorting, such as identity,
                        bucket, or truncate.

                      * **Direction** *(string) --* **[REQUIRED]**

                        The sort direction for this field, either
                        ascending or descending.

                      * **NullOrder** *(string) --* **[REQUIRED]**

                        The ordering behavior for null values in this
                        field, specifying whether nulls should appear
                        first or last in the sort order.

                * **Location** *(string) --* **[REQUIRED]**

                  The updated S3 location where the Iceberg table data
                  will be stored.

                * **Properties** *(dict) --*

                  Updated key-value pairs of table properties and
                  configuration settings for the Iceberg table.

                  * *(string) --*

                    * *(string) --*
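
   The nested "UpdateOpenTableFormatInput" structure described above
   can be assembled with a small builder. This is a sketch under
   stated assumptions: the helper name "iceberg_update_input" is
   hypothetical, the S3 path is a placeholder, and the field type
   names "long" and "string" are primitive type names from the Apache
   Iceberg specification rather than values listed in this reference.
   Only the keys marked required above ("Schema" with "Fields", and
   "Location") are always set.

```python
def iceberg_update_input(fields, location, properties=None):
    """Build an UpdateOpenTableFormatInput dict with one Iceberg update.

    `fields` is a list of (id, name, type, required) tuples.
    This helper is illustrative and not part of boto3.
    """
    update = {
        'Schema': {
            'Type': 'struct',
            'Fields': [
                {'Id': fid, 'Name': name, 'Type': ftype, 'Required': required}
                for fid, name, ftype, required in fields
            ],
        },
        'Location': location,
    }
    if properties:
        # Optional key-value table properties.
        update['Properties'] = dict(properties)
    return {
        'UpdateIcebergInput': {
            'UpdateIcebergTableInput': {'Updates': [update]}
        }
    }


payload = iceberg_update_input(
    fields=[(1, 'id', 'long', True), (2, 'note', 'string', False)],
    location='s3://example-bucket/warehouse/events/',  # hypothetical path
)
```

   The resulting dict would be passed as
   "UpdateOpenTableFormatInput=payload" to "client.update_table",
   together with the identifying parameters such as "DatabaseName".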

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ConcurrentModificationException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.GlueEncryptionException"

   * "Glue.Client.exceptions.ResourceNotReadyException"

   * "Glue.Client.exceptions.FederationSourceException"

   * "Glue.Client.exceptions.FederationSourceRetryableException"

   * "Glue.Client.exceptions.AlreadyExistsException"
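
   A common way to use "update_table" is read-modify-write: fetch the
   table with "get_table", copy the fields that "TableInput" accepts,
   change what is needed, and send the result back. The sketch below
   assumes "client" is a Glue client created with
   "boto3.client('glue')"; the names "TABLE_INPUT_KEYS",
   "table_input_from_table" and "set_table_parameter" are
   hypothetical. The key list is taken from the Request Syntax above,
   with "ViewDefinition" deliberately left out because this sketch
   does not assume the structure returned by "get_table" matches the
   input structure.

```python
# Top-level TableInput keys, copied from the Request Syntax above.
TABLE_INPUT_KEYS = (
    'Name', 'Description', 'Owner', 'LastAccessTime', 'LastAnalyzedTime',
    'Retention', 'StorageDescriptor', 'PartitionKeys', 'ViewOriginalText',
    'ViewExpandedText', 'TableType', 'Parameters', 'TargetTable',
)


def table_input_from_table(table):
    """Keep only the keys of a get_table 'Table' dict that TableInput accepts."""
    return {key: table[key] for key in TABLE_INPUT_KEYS if key in table}


def set_table_parameter(client, database, name, key, value):
    """Set one table parameter while leaving the rest of the table as-is."""
    table = client.get_table(DatabaseName=database, Name=name)['Table']
    table_input = table_input_from_table(table)
    # Copy the parameters so the fetched dict is not mutated.
    params = dict(table_input.get('Parameters', {}))
    params[key] = value
    table_input['Parameters'] = params
    client.update_table(DatabaseName=database, TableInput=table_input)
    return table_input
```

   Filtering the keys matters because a fetched table also carries
   fields that are not part of "TableInput", and sending them back
   would fail parameter validation.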


create_user_defined_function
****************************

Glue.Client.create_user_defined_function(**kwargs)

   Creates a new function definition in the Data Catalog.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_user_defined_function(
          CatalogId='string',
          DatabaseName='string',
          FunctionInput={
              'FunctionName': 'string',
              'ClassName': 'string',
              'OwnerName': 'string',
              'OwnerType': 'USER'|'ROLE'|'GROUP',
              'ResourceUris': [
                  {
                      'ResourceType': 'JAR'|'FILE'|'ARCHIVE',
                      'Uri': 'string'
                  },
              ]
          }
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog in
        which to create the function. If none is provided, the Amazon
        Web Services account ID is used by default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the catalog database in which to create the
        function.

      * **FunctionInput** (*dict*) --

        **[REQUIRED]**

        A "FunctionInput" object that defines the function to create
        in the Data Catalog.

        * **FunctionName** *(string) --*

          The name of the function.

        * **ClassName** *(string) --*

          The Java class that contains the function code.

        * **OwnerName** *(string) --*

          The owner of the function.

        * **OwnerType** *(string) --*

          The owner type.

        * **ResourceUris** *(list) --*

          The resource URIs for the function.

          * *(dict) --*

            The URIs for function resources.

            * **ResourceType** *(string) --*

              The type of the resource.

            * **Uri** *(string) --*

              The URI for accessing the resource.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.GlueEncryptionException"
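
   As an illustration of the "FunctionInput" structure, the sketch
   below registers a JAR-backed function and treats
   "AlreadyExistsException" as a non-fatal outcome. The helper name
   "create_java_udf" is hypothetical, "client" is assumed to be a
   Glue client created with "boto3.client('glue')", and defaulting
   "OwnerType" to "USER" is a choice made for this example only.

```python
def create_java_udf(client, database, function_name, class_name, jar_uri,
                    owner_name=None):
    """Create a function backed by a JAR; return False if it already exists.

    This helper is illustrative and not part of boto3.
    """
    function_input = {
        'FunctionName': function_name,
        'ClassName': class_name,
        'ResourceUris': [{'ResourceType': 'JAR', 'Uri': jar_uri}],
    }
    if owner_name is not None:
        function_input['OwnerName'] = owner_name
        function_input['OwnerType'] = 'USER'  # example default, see lead-in
    try:
        client.create_user_defined_function(
            DatabaseName=database, FunctionInput=function_input)
    except client.exceptions.AlreadyExistsException:
        # A function with this name is already registered.
        return False
    return True
```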


batch_delete_table_version
**************************

Glue.Client.batch_delete_table_version(**kwargs)

   Deletes a specified batch of versions of a table.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.batch_delete_table_version(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          VersionIds=[
              'string',
          ]
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog where
        the tables reside. If none is provided, the Amazon Web
        Services account ID is used by default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The database in the catalog in which the table resides. For
        Hive compatibility, this name is entirely lowercase.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table. For Hive compatibility, this name is
        entirely lowercase.

      * **VersionIds** (*list*) --

        **[REQUIRED]**

        A list of the IDs of versions to be deleted. A "VersionId" is
        a string representation of an integer. Each version is
        incremented by 1.

        * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Errors': [
                 {
                     'TableName': 'string',
                     'VersionId': 'string',
                     'ErrorDetail': {
                         'ErrorCode': 'string',
                         'ErrorMessage': 'string'
                     }
                 },
             ]
         }

      **Response Structure**

      * *(dict) --*

        * **Errors** *(list) --*

          A list of errors encountered while trying to delete the
          specified table versions.

          * *(dict) --*

            An error record for table-version operations.

            * **TableName** *(string) --*

              The name of the table in question.

            * **VersionId** *(string) --*

              The ID value of the version in question. A "VersionId"
              is a string representation of an integer. Each version
              is incremented by 1.

            * **ErrorDetail** *(dict) --*

              The details about the error.

              * **ErrorCode** *(string) --*

                The code associated with this error.

              * **ErrorMessage** *(string) --*

                A message describing the error.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
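
   The "Errors" list above reports per-version failures instead of
   raising, so a caller usually needs to inspect it. The following is
   a minimal sketch, assuming the operation documented here is
   "batch_delete_table_version" (its name does not appear in this part
   of the page). The helper name "delete_table_versions" and its
   return shape are illustrative, not part of boto3.

```python
def delete_table_versions(client, database_name, table_name, version_ids,
                          catalog_id=None):
    """Delete table versions; return {version_id: (code, message)} for failures.

    `client` is assumed to be a Glue client, e.g. boto3.client('glue').
    """
    kwargs = {
        "DatabaseName": database_name,
        "TableName": table_name,
        # A VersionId is a string representation of an integer.
        "VersionIds": [str(v) for v in version_ids],
    }
    # CatalogId is optional; the account ID is used when it is omitted.
    if catalog_id is not None:
        kwargs["CatalogId"] = catalog_id

    response = client.batch_delete_table_version(**kwargs)

    failures = {}
    for error in response.get("Errors", []):
        detail = error.get("ErrorDetail", {})
        failures[error.get("VersionId")] = (
            detail.get("ErrorCode"),
            detail.get("ErrorMessage"),
        )
    return failures
```

   An empty dict from this helper means every requested version was
   deleted without a reported error.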


update_dev_endpoint
*******************

Glue.Client.update_dev_endpoint(**kwargs)

   Updates a specified development endpoint.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.update_dev_endpoint(
          EndpointName='string',
          PublicKey='string',
          AddPublicKeys=[
              'string',
          ],
          DeletePublicKeys=[
              'string',
          ],
          CustomLibraries={
              'ExtraPythonLibsS3Path': 'string',
              'ExtraJarsS3Path': 'string'
          },
          UpdateEtlLibraries=True|False,
          DeleteArguments=[
              'string',
          ],
          AddArguments={
              'string': 'string'
          }
      )

   Parameters:
      * **EndpointName** (*string*) --

        **[REQUIRED]**

        The name of the "DevEndpoint" to be updated.

      * **PublicKey** (*string*) -- The public key for the
        "DevEndpoint" to use.

      * **AddPublicKeys** (*list*) --

        The list of public keys for the "DevEndpoint" to use.

        * *(string) --*

      * **DeletePublicKeys** (*list*) --

        The list of public keys to be deleted from the "DevEndpoint".

        * *(string) --*

      * **CustomLibraries** (*dict*) --

        Custom Python or Java libraries to be loaded in the
        "DevEndpoint".

        * **ExtraPythonLibsS3Path** *(string) --*

          The paths to one or more Python libraries in an Amazon
          Simple Storage Service (Amazon S3) bucket that should be
          loaded in your "DevEndpoint". Multiple values must be
          complete paths separated by a comma.

          Note:

            You can only use pure Python libraries with a
            "DevEndpoint". Libraries that rely on C extensions, such
            as the pandas Python data analysis library, are not
            currently supported.

        * **ExtraJarsS3Path** *(string) --*

          The path to one or more Java ".jar" files in an S3 bucket
          that should be loaded in your "DevEndpoint".

          Note:

            You can only use pure Java/Scala libraries with a
            "DevEndpoint".

      * **UpdateEtlLibraries** (*boolean*) -- "True" if the list of
        custom libraries to be loaded in the development endpoint
        needs to be updated, or "False" otherwise.

      * **DeleteArguments** (*list*) --

        The list of argument keys to be deleted from the map of
        arguments used to configure the "DevEndpoint".

        * *(string) --*

      * **AddArguments** (*dict*) --

        The map of arguments to add to the map of arguments used to
        configure the "DevEndpoint".

        Valid arguments are:

        * "--enable-glue-datacatalog": ""

        You can specify a version of Python support for development
        endpoints by using the "Arguments" parameter in the
        "CreateDevEndpoint" or "UpdateDevEndpoint" APIs. If no
        arguments are provided, the version defaults to Python 2.

        * *(string) --*

          * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.ValidationException"
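
   The parameters above can be assembled programmatically. The sketch
   below builds keyword arguments for "update_dev_endpoint"; the
   function name "build_dev_endpoint_update" is hypothetical, and
   setting "UpdateEtlLibraries" to "True" whenever custom libraries
   are supplied is an assumption about typical use, not a documented
   requirement.

```python
def build_dev_endpoint_update(endpoint_name, python_libs=(), extra_jars_path=None,
                              add_public_keys=(), delete_public_keys=()):
    """Build keyword arguments for Glue's update_dev_endpoint call."""
    kwargs = {"EndpointName": endpoint_name}

    libraries = {}
    if python_libs:
        # Multiple Python library paths must be complete paths separated by a comma.
        libraries["ExtraPythonLibsS3Path"] = ",".join(python_libs)
    if extra_jars_path:
        # Passed through unchanged as a single string.
        libraries["ExtraJarsS3Path"] = extra_jars_path
    if libraries:
        kwargs["CustomLibraries"] = libraries
        # Assumption: ask the service to reload libraries when any are given.
        kwargs["UpdateEtlLibraries"] = True

    if add_public_keys:
        kwargs["AddPublicKeys"] = list(add_public_keys)
    if delete_public_keys:
        kwargs["DeletePublicKeys"] = list(delete_public_keys)
    return kwargs
```

   With a Glue client in hand, the result would be passed as
   "client.update_dev_endpoint(**build_dev_endpoint_update(...))".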


delete_custom_entity_type
*************************

Glue.Client.delete_custom_entity_type(**kwargs)

   Deletes a custom pattern by specifying its name.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_custom_entity_type(
          Name='string'
      )

   Parameters:
      **Name** (*string*) --

      **[REQUIRED]**

      The name of the custom pattern that you want to delete.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Name': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Name** *(string) --*

          The name of the custom pattern you deleted.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"
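
   Because deleting a pattern that does not exist raises
   "EntityNotFoundException", cleanup code often wants an idempotent
   wrapper. This is a minimal sketch; the helper name
   "delete_custom_entity_type_if_exists" is hypothetical, and "client"
   is assumed to be a Glue client such as "boto3.client('glue')".

```python
def delete_custom_entity_type_if_exists(client, name):
    """Delete a custom pattern; return its name, or None if it was not found."""
    try:
        response = client.delete_custom_entity_type(Name=name)
    except client.exceptions.EntityNotFoundException:
        # Nothing to delete; treat as already removed.
        return None
    return response.get("Name")
```

   Other exceptions listed above (for example
   "AccessDeniedException") are deliberately left to propagate.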


stop_session
************

Glue.Client.stop_session(**kwargs)

   Stops the session.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.stop_session(
          Id='string',
          RequestOrigin='string'
      )

   Parameters:
      * **Id** (*string*) --

        **[REQUIRED]**

        The ID of the session to be stopped.

      * **RequestOrigin** (*string*) -- The origin of the request.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Id': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Id** *(string) --*

          The ID of the stopped session.

   **Exceptions**

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.IllegalSessionStateException"

   * "Glue.Client.exceptions.ConcurrentModificationException"


get_blueprint
*************

Glue.Client.get_blueprint(**kwargs)

   Retrieves the details of a blueprint.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_blueprint(
          Name='string',
          IncludeBlueprint=True|False,
          IncludeParameterSpec=True|False
      )

   Parameters:
      * **Name** (*string*) --

        **[REQUIRED]**

        The name of the blueprint.

      * **IncludeBlueprint** (*boolean*) -- Specifies whether or not
        to include the blueprint in the response.

      * **IncludeParameterSpec** (*boolean*) -- Specifies whether or
        not to include the parameter specification.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Blueprint': {
                 'Name': 'string',
                 'Description': 'string',
                 'CreatedOn': datetime(2015, 1, 1),
                 'LastModifiedOn': datetime(2015, 1, 1),
                 'ParameterSpec': 'string',
                 'BlueprintLocation': 'string',
                 'BlueprintServiceLocation': 'string',
                 'Status': 'CREATING'|'ACTIVE'|'UPDATING'|'FAILED',
                 'ErrorMessage': 'string',
                 'LastActiveDefinition': {
                     'Description': 'string',
                     'LastModifiedOn': datetime(2015, 1, 1),
                     'ParameterSpec': 'string',
                     'BlueprintLocation': 'string',
                     'BlueprintServiceLocation': 'string'
                 }
             }
         }

      **Response Structure**

      * *(dict) --*

        * **Blueprint** *(dict) --*

          Returns a "Blueprint" object.

          * **Name** *(string) --*

            The name of the blueprint.

          * **Description** *(string) --*

            The description of the blueprint.

          * **CreatedOn** *(datetime) --*

            The date and time the blueprint was registered.

          * **LastModifiedOn** *(datetime) --*

            The date and time the blueprint was last modified.

          * **ParameterSpec** *(string) --*

            A JSON string that indicates the list of parameter
            specifications for the blueprint.

          * **BlueprintLocation** *(string) --*

            Specifies the path in Amazon S3 where the blueprint is
            published.

          * **BlueprintServiceLocation** *(string) --*

            Specifies a path in Amazon S3 where the blueprint is
            copied when you call "CreateBlueprint/UpdateBlueprint" to
            register the blueprint in Glue.

          * **Status** *(string) --*

            The status of the blueprint registration.

            * Creating — The blueprint registration is in progress.

            * Active — The blueprint has been successfully registered.

            * Updating — An update to the blueprint registration is in
              progress.

            * Failed — The blueprint registration failed.

          * **ErrorMessage** *(string) --*

            An error message.

          * **LastActiveDefinition** *(dict) --*

            When there are multiple versions of a blueprint and the
            latest version has some errors, this attribute indicates
            the last successful blueprint definition that is available
            with the service.

            * **Description** *(string) --*

              The description of the blueprint.

            * **LastModifiedOn** *(datetime) --*

              The date and time the blueprint was last modified.

            * **ParameterSpec** *(string) --*

              A JSON string specifying the parameters for the
              blueprint.

            * **BlueprintLocation** *(string) --*

              Specifies a path in Amazon S3 where the blueprint is
              published by the Glue developer.

            * **BlueprintServiceLocation** *(string) --*

              Specifies a path in Amazon S3 where the blueprint is
              copied when you create or update the blueprint.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"
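
   "ParameterSpec" is returned as a JSON string, and
   "LastActiveDefinition" carries the last successful definition when
   the latest version has errors. The sketch below parses the
   specification from a "get_blueprint" response. Falling back to
   "LastActiveDefinition" only when "Status" is "FAILED" is an
   assumption made for illustration, and the function name is
   hypothetical.

```python
import json


def usable_parameter_spec(response):
    """Return the parsed parameter spec from a get_blueprint response, or None."""
    blueprint = response.get("Blueprint", {})
    spec = blueprint.get("ParameterSpec")

    # Assumption: a FAILED registration should fall back to the last
    # successful definition held by the service.
    if blueprint.get("Status") == "FAILED":
        spec = blueprint.get("LastActiveDefinition", {}).get("ParameterSpec")

    # ParameterSpec is a JSON string; it may be absent if it was not requested.
    return json.loads(spec) if spec else None
```

   The specification is only expected in the response when the call
   is made with "IncludeParameterSpec=True".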


can_paginate
************

Glue.Client.can_paginate(operation_name)

   Check if an operation can be paginated.
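
   One way to use this check is to choose between a paginator and a
   single direct call. The sketch below is illustrative only: the
   helper "collect_results" and its "result_key" argument are
   hypothetical, and "client" is assumed to be a boto3 client that
   provides "can_paginate" and "get_paginator".

```python
def collect_results(client, operation_name, result_key, **kwargs):
    """Gather every item under `result_key`, paginating when supported."""
    if client.can_paginate(operation_name):
        paginator = client.get_paginator(operation_name)
        items = []
        # Each page is a response dict shaped like a single call's result.
        for page in paginator.paginate(**kwargs):
            items.extend(page.get(result_key, []))
        return items

    # Not pageable: invoke the client method of the same name once.
    response = getattr(client, operation_name)(**kwargs)
    return list(response.get(result_key, []))
```

   The caller is responsible for knowing which response key holds the
   items for a given operation.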

   Parameters:
      **operation_name** (*string*) -- The operation name.  This is
      the same name as the method name on the client.  For example, if
      the method name is "create_foo", and you'd normally invoke the
      operation as "client.create_foo(**kwargs)", if the "create_foo"
      operation can be paginated, you can use the call
      "client.get_paginator("create_foo")".

   Returns:
      "True" if the operation can be paginated, "False" otherwise.


update_blueprint
****************

Glue.Client.update_blueprint(**kwargs)

   Updates a registered blueprint.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.update_blueprint(
          Name='string',
          Description='string',
          BlueprintLocation='string'
      )

   Parameters:
      * **Name** (*string*) --

        **[REQUIRED]**

        The name of the blueprint.

      * **Description** (*string*) -- A description of the blueprint.

      * **BlueprintLocation** (*string*) --

        **[REQUIRED]**

        Specifies a path in Amazon S3 where the blueprint is
        published.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Name': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Name** *(string) --*

          Returns the name of the blueprint that was updated.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.ConcurrentModificationException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.IllegalBlueprintStateException"


stop_trigger
************

Glue.Client.stop_trigger(**kwargs)

   Stops a specified trigger.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.stop_trigger(
          Name='string'
      )

   Parameters:
      **Name** (*string*) --

      **[REQUIRED]**

      The name of the trigger to stop.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Name': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Name** *(string) --*

          The name of the trigger that was stopped.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ConcurrentModificationException"


batch_get_blueprints
********************

Glue.Client.batch_get_blueprints(**kwargs)

   Retrieves information about a list of blueprints.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.batch_get_blueprints(
          Names=[
              'string',
          ],
          IncludeBlueprint=True|False,
          IncludeParameterSpec=True|False
      )

   Parameters:
      * **Names** (*list*) --

        **[REQUIRED]**

        A list of blueprint names.

        * *(string) --*

      * **IncludeBlueprint** (*boolean*) -- Specifies whether or not
        to include the blueprint in the response.

      * **IncludeParameterSpec** (*boolean*) -- Specifies whether or
        not to include the parameters, as a JSON string, for the
        blueprint in the response.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Blueprints': [
                 {
                     'Name': 'string',
                     'Description': 'string',
                     'CreatedOn': datetime(2015, 1, 1),
                     'LastModifiedOn': datetime(2015, 1, 1),
                     'ParameterSpec': 'string',
                     'BlueprintLocation': 'string',
                     'BlueprintServiceLocation': 'string',
                     'Status': 'CREATING'|'ACTIVE'|'UPDATING'|'FAILED',
                     'ErrorMessage': 'string',
                     'LastActiveDefinition': {
                         'Description': 'string',
                         'LastModifiedOn': datetime(2015, 1, 1),
                         'ParameterSpec': 'string',
                         'BlueprintLocation': 'string',
                         'BlueprintServiceLocation': 'string'
                     }
                 },
             ],
             'MissingBlueprints': [
                 'string',
             ]
         }

      **Response Structure**

      * *(dict) --*

        * **Blueprints** *(list) --*

          Returns a list of blueprints as a "Blueprints" object.

          * *(dict) --*

            The details of a blueprint.

            * **Name** *(string) --*

              The name of the blueprint.

            * **Description** *(string) --*

              The description of the blueprint.

            * **CreatedOn** *(datetime) --*

              The date and time the blueprint was registered.

            * **LastModifiedOn** *(datetime) --*

              The date and time the blueprint was last modified.

            * **ParameterSpec** *(string) --*

              A JSON string that indicates the list of parameter
              specifications for the blueprint.

            * **BlueprintLocation** *(string) --*

              Specifies the path in Amazon S3 where the blueprint is
              published.

            * **BlueprintServiceLocation** *(string) --*

              Specifies a path in Amazon S3 where the blueprint is
              copied when you call "CreateBlueprint/UpdateBlueprint"
              to register the blueprint in Glue.

            * **Status** *(string) --*

              The status of the blueprint registration.

              * Creating — The blueprint registration is in progress.

              * Active — The blueprint has been successfully
                registered.

              * Updating — An update to the blueprint registration is
                in progress.

              * Failed — The blueprint registration failed.

            * **ErrorMessage** *(string) --*

              An error message.

            * **LastActiveDefinition** *(dict) --*

              When there are multiple versions of a blueprint and the
              latest version has some errors, this attribute indicates
              the last successful blueprint definition that is
              available with the service.

              * **Description** *(string) --*

                The description of the blueprint.

              * **LastModifiedOn** *(datetime) --*

                The date and time the blueprint was last modified.

              * **ParameterSpec** *(string) --*

                A JSON string specifying the parameters for the
                blueprint.

              * **BlueprintLocation** *(string) --*

                Specifies a path in Amazon S3 where the blueprint is
                published by the Glue developer.

              * **BlueprintServiceLocation** *(string) --*

                Specifies a path in Amazon S3 where the blueprint is
                copied when you create or update the blueprint.

        * **MissingBlueprints** *(list) --*

          Returns a list of "BlueprintNames" that were not found.

          * *(string) --*

   **Exceptions**

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"
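
   When many blueprint names are involved, the call can be issued in
   batches and the "Blueprints" and "MissingBlueprints" lists merged.
   This is a sketch under stated assumptions: the helper name
   "get_blueprints" is hypothetical, and the default "batch_size" of
   25 is an assumed per-call limit that should be checked against the
   service documentation.

```python
def get_blueprints(client, names, batch_size=25, include_parameter_spec=False):
    """Fetch blueprints in batches; return (found, missing_names)."""
    names = list(names)
    found, missing = [], []
    for start in range(0, len(names), batch_size):
        response = client.batch_get_blueprints(
            Names=names[start:start + batch_size],
            IncludeParameterSpec=include_parameter_spec,
        )
        # Names that could not be resolved come back separately.
        found.extend(response.get("Blueprints", []))
        missing.extend(response.get("MissingBlueprints", []))
    return found, missing
```

   A non-empty second element signals names that were not found,
   which this operation reports in the response rather than through
   an exception.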


create_crawler
**************

Glue.Client.create_crawler(**kwargs)

   Creates a new crawler with specified targets, role, configuration,
   and optional schedule. At least one crawl target must be specified,
   in the "S3Targets" field, the "JdbcTargets" field, or the
   "DynamoDBTargets" field.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_crawler(
          Name='string',
          Role='string',
          DatabaseName='string',
          Description='string',
          Targets={
              'S3Targets': [
                  {
                      'Path': 'string',
                      'Exclusions': [
                          'string',
                      ],
                      'ConnectionName': 'string',
                      'SampleSize': 123,
                      'EventQueueArn': 'string',
                      'DlqEventQueueArn': 'string'
                  },
              ],
              'JdbcTargets': [
                  {
                      'ConnectionName': 'string',
                      'Path': 'string',
                      'Exclusions': [
                          'string',
                      ],
                      'EnableAdditionalMetadata': [
                          'COMMENTS'|'RAWTYPES',
                      ]
                  },
              ],
              'MongoDBTargets': [
                  {
                      'ConnectionName': 'string',
                      'Path': 'string',
                      'ScanAll': True|False
                  },
              ],
              'DynamoDBTargets': [
                  {
                      'Path': 'string',
                      'scanAll': True|False,
                      'scanRate': 123.0
                  },
              ],
              'CatalogTargets': [
                  {
                      'DatabaseName': 'string',
                      'Tables': [
                          'string',
                      ],
                      'ConnectionName': 'string',
                      'EventQueueArn': 'string',
                      'DlqEventQueueArn': 'string'
                  },
              ],
              'DeltaTargets': [
                  {
                      'DeltaTables': [
                          'string',
                      ],
                      'ConnectionName': 'string',
                      'WriteManifest': True|False,
                      'CreateNativeDeltaTable': True|False
                  },
              ],
              'IcebergTargets': [
                  {
                      'Paths': [
                          'string',
                      ],
                      'ConnectionName': 'string',
                      'Exclusions': [
                          'string',
                      ],
                      'MaximumTraversalDepth': 123
                  },
              ],
              'HudiTargets': [
                  {
                      'Paths': [
                          'string',
                      ],
                      'ConnectionName': 'string',
                      'Exclusions': [
                          'string',
                      ],
                      'MaximumTraversalDepth': 123
                  },
              ]
          },
          Schedule='string',
          Classifiers=[
              'string',
          ],
          TablePrefix='string',
          SchemaChangePolicy={
              'UpdateBehavior': 'LOG'|'UPDATE_IN_DATABASE',
              'DeleteBehavior': 'LOG'|'DELETE_FROM_DATABASE'|'DEPRECATE_IN_DATABASE'
          },
          RecrawlPolicy={
              'RecrawlBehavior': 'CRAWL_EVERYTHING'|'CRAWL_NEW_FOLDERS_ONLY'|'CRAWL_EVENT_MODE'
          },
          LineageConfiguration={
              'CrawlerLineageSettings': 'ENABLE'|'DISABLE'
          },
          LakeFormationConfiguration={
              'UseLakeFormationCredentials': True|False,
              'AccountId': 'string'
          },
          Configuration='string',
          CrawlerSecurityConfiguration='string',
          Tags={
              'string': 'string'
          }
      )
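
   Most fields in the request above are optional. The sketch below
   builds a small request with only Amazon S3 targets and applies two
   constraints stated in the parameter descriptions: at least one
   target, and a "SampleSize" between 1 and 249. The function name
   "build_s3_crawler_request" is hypothetical.

```python
def build_s3_crawler_request(name, role, database_name, s3_paths,
                             sample_size=None, schedule=None):
    """Build keyword arguments for create_crawler using S3 targets only."""
    if not s3_paths:
        raise ValueError("at least one crawl target is required")
    if sample_size is not None and not 1 <= sample_size <= 249:
        raise ValueError("SampleSize must be an integer between 1 and 249")

    targets = []
    for path in s3_paths:
        target = {"Path": path}
        if sample_size is not None:
            # Limits the number of files crawled in each leaf folder.
            target["SampleSize"] = sample_size
        targets.append(target)

    request = {
        "Name": name,
        "Role": role,
        "DatabaseName": database_name,
        "Targets": {"S3Targets": targets},
    }
    if schedule is not None:
        # A cron expression, e.g. "cron(15 12 * * ? *)" for 12:15 UTC daily.
        request["Schedule"] = schedule
    return request
```

   The result would be passed as
   "client.create_crawler(**build_s3_crawler_request(...))".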

   Parameters:
      * **Name** (*string*) --

        **[REQUIRED]**

        Name of the new crawler.

      * **Role** (*string*) --

        **[REQUIRED]**

        The IAM role or Amazon Resource Name (ARN) of an IAM role used
        by the new crawler to access customer resources.

      * **DatabaseName** (*string*) -- The Glue database where results
        are written, such as:
        "arn:aws:daylight:us-east-1::database/sometable/*".

      * **Description** (*string*) -- A description of the new
        crawler.

      * **Targets** (*dict*) --

        **[REQUIRED]**

        A collection of targets to crawl.

        * **S3Targets** *(list) --*

          Specifies Amazon Simple Storage Service (Amazon S3) targets.

          * *(dict) --*

            Specifies a data store in Amazon Simple Storage Service
            (Amazon S3).

            * **Path** *(string) --*

              The path to the Amazon S3 target.

            * **Exclusions** *(list) --*

              A list of glob patterns used to exclude from the crawl.
              For more information, see Catalog Tables with a Crawler.

              * *(string) --*

            * **ConnectionName** *(string) --*

              The name of a connection which allows a job or crawler
              to access data in Amazon S3 within an Amazon Virtual
              Private Cloud environment (Amazon VPC).

            * **SampleSize** *(integer) --*

              Sets the number of files in each leaf folder to be
              crawled when crawling sample files in a dataset. If not
              set, all the files are crawled. A valid value is an
              integer between 1 and 249.

            * **EventQueueArn** *(string) --*

              A valid Amazon SQS ARN. For example,
              "arn:aws:sqs:region:account:sqs".

            * **DlqEventQueueArn** *(string) --*

              A valid Amazon dead-letter SQS ARN. For example,
              "arn:aws:sqs:region:account:deadLetterQueue".

        * **JdbcTargets** *(list) --*

          Specifies JDBC targets.

          * *(dict) --*

            Specifies a JDBC data store to crawl.

            * **ConnectionName** *(string) --*

              The name of the connection to use to connect to the JDBC
              target.

            * **Path** *(string) --*

              The path of the JDBC target.

            * **Exclusions** *(list) --*

              A list of glob patterns used to exclude from the crawl.
              For more information, see Catalog Tables with a Crawler.

              * *(string) --*

            * **EnableAdditionalMetadata** *(list) --*

              Specify a value of "RAWTYPES" or "COMMENTS" to enable
              additional metadata in table responses. "RAWTYPES"
              provides the native-level datatype. "COMMENTS" provides
              comments associated with a column or table in the
              database.

              If you do not need additional metadata, keep the field
              empty.

              * *(string) --*

        * **MongoDBTargets** *(list) --*

          Specifies Amazon DocumentDB or MongoDB targets.

          * *(dict) --*

            Specifies an Amazon DocumentDB or MongoDB data store to
            crawl.

            * **ConnectionName** *(string) --*

              The name of the connection to use to connect to the
              Amazon DocumentDB or MongoDB target.

            * **Path** *(string) --*

              The path of the Amazon DocumentDB or MongoDB target
              (database/collection).

            * **ScanAll** *(boolean) --*

              Indicates whether to scan all the records, or to sample
              rows from the table. Scanning all the records can take a
              long time when the table is not a high throughput table.

              A value of "true" means to scan all records, while a
              value of "false" means to sample the records. If no
              value is specified, the value defaults to "true".

        * **DynamoDBTargets** *(list) --*

          Specifies Amazon DynamoDB targets.

          * *(dict) --*

            Specifies an Amazon DynamoDB table to crawl.

            * **Path** *(string) --*

              The name of the DynamoDB table to crawl.

            * **scanAll** *(boolean) --*

              Indicates whether to scan all the records, or to sample
              rows from the table. Scanning all the records can take a
              long time when the table is not a high throughput table.

              A value of "true" means to scan all records, while a
              value of "false" means to sample the records. If no
              value is specified, the value defaults to "true".

            * **scanRate** *(float) --*

              The percentage of the configured read capacity units
              for the Glue crawler to use. Read capacity units is a
              term defined by DynamoDB, and is a numeric value that
              acts as a rate limiter for the number of reads that can
              be performed on that table per second.

              The valid values are null or a value between 0.1 and
              1.5. A null value is used when the user does not provide
              a value, and defaults to 0.5 of the configured Read
              Capacity Unit (for provisioned tables), or 0.25 of the
              max configured Read Capacity Unit (for tables using
              on-demand mode).

        * **CatalogTargets** *(list) --*

          Specifies Glue Data Catalog targets.

          * *(dict) --*

            Specifies a Glue Data Catalog target.

            * **DatabaseName** *(string) --* **[REQUIRED]**

              The name of the database to be synchronized.

            * **Tables** *(list) --* **[REQUIRED]**

              A list of the tables to be synchronized.

              * *(string) --*

            * **ConnectionName** *(string) --*

              The name of the connection for an Amazon S3-backed Data
              Catalog table to be a target of the crawl when using a
              "Catalog" connection type paired with a "NETWORK"
              connection type.

            * **EventQueueArn** *(string) --*

              A valid Amazon SQS ARN. For example,
              "arn:aws:sqs:region:account:sqs".

            * **DlqEventQueueArn** *(string) --*

              A valid Amazon dead-letter SQS ARN. For example,
              "arn:aws:sqs:region:account:deadLetterQueue".

        * **DeltaTargets** *(list) --*

          Specifies Delta data store targets.

          * *(dict) --*

            Specifies a Delta data store to crawl one or more Delta
            tables.

            * **DeltaTables** *(list) --*

              A list of the Amazon S3 paths to the Delta tables.

              * *(string) --*

            * **ConnectionName** *(string) --*

              The name of the connection to use to connect to the
              Delta table target.

            * **WriteManifest** *(boolean) --*

              Specifies whether to write the manifest files to the
              Delta table path.

            * **CreateNativeDeltaTable** *(boolean) --*

              Specifies whether the crawler will create native tables,
              to allow integration with query engines that support
              querying of the Delta transaction log directly.

        * **IcebergTargets** *(list) --*

          Specifies Apache Iceberg data store targets.

          * *(dict) --*

            Specifies an Apache Iceberg data source where Iceberg
            tables are stored in Amazon S3.

            * **Paths** *(list) --*

              One or more Amazon S3 paths that contain Iceberg
              metadata folders, in the form "s3://bucket/prefix".

              * *(string) --*

            * **ConnectionName** *(string) --*

              The name of the connection to use to connect to the
              Iceberg target.

            * **Exclusions** *(list) --*

              A list of glob patterns used to exclude from the crawl.
              For more information, see Catalog Tables with a Crawler.

              * *(string) --*

            * **MaximumTraversalDepth** *(integer) --*

              The maximum depth of Amazon S3 paths that the crawler
              can traverse to discover the Iceberg metadata folder in
              your Amazon S3 path. Used to limit the crawler run time.

        * **HudiTargets** *(list) --*

          Specifies Apache Hudi data store targets.

          * *(dict) --*

            Specifies an Apache Hudi data source.

            * **Paths** *(list) --*

              An array of Amazon S3 location strings for Hudi, each
              indicating the root folder in which the metadata files
              for a Hudi table reside. The Hudi folder may be located
              in a child folder of the root folder.

              The crawler will scan all folders underneath a path for
              a Hudi folder.

              * *(string) --*

            * **ConnectionName** *(string) --*

              The name of the connection to use to connect to the Hudi
              target. If your Hudi files are stored in buckets that
              require VPC authorization, you can set their connection
              properties here.

            * **Exclusions** *(list) --*

              A list of glob patterns used to exclude from the crawl.
              For more information, see Catalog Tables with a Crawler.

              * *(string) --*

            * **MaximumTraversalDepth** *(integer) --*

              The maximum depth of Amazon S3 paths that the crawler
              can traverse to discover the Hudi metadata folder in
              your Amazon S3 path. Used to limit the crawler run time.

      * **Schedule** (*string*) -- A "cron" expression used to specify
        the schedule (see Time-Based Schedules for Jobs and Crawlers).
        For example, to run something every day at 12:15 UTC, you
        would specify: "cron(15 12 * * ? *)".

      * **Classifiers** (*list*) --

        A list of custom classifiers that the user has registered. By
        default, all built-in classifiers are included in a crawl, but
        these custom classifiers always override the default
        classifiers for a given classification.

        * *(string) --*

      * **TablePrefix** (*string*) -- The table prefix used for
        catalog tables that are created.

      * **SchemaChangePolicy** (*dict*) --

        The policy for the crawler's update and deletion behavior.

        * **UpdateBehavior** *(string) --*

          The update behavior when the crawler finds a changed schema.

        * **DeleteBehavior** *(string) --*

          The deletion behavior when the crawler finds a deleted
          object.

      * **RecrawlPolicy** (*dict*) --

        A policy that specifies whether to crawl the entire dataset
        again, or to crawl only folders that were added since the last
        crawler run.

        * **RecrawlBehavior** *(string) --*

          Specifies whether to crawl the entire dataset again or to
          crawl only folders that were added since the last crawler
          run.

          A value of "CRAWL_EVERYTHING" specifies crawling the entire
          dataset again.

          A value of "CRAWL_NEW_FOLDERS_ONLY" specifies crawling only
          folders that were added since the last crawler run.

          A value of "CRAWL_EVENT_MODE" specifies crawling only the
          changes identified by Amazon S3 events.

      * **LineageConfiguration** (*dict*) --

        Specifies data lineage configuration settings for the crawler.

        * **CrawlerLineageSettings** *(string) --*

          Specifies whether data lineage is enabled for the crawler.
          Valid values are:

          * ENABLE: enables data lineage for the crawler

          * DISABLE: disables data lineage for the crawler

      * **LakeFormationConfiguration** (*dict*) --

        Specifies Lake Formation configuration settings for the
        crawler.

        * **UseLakeFormationCredentials** *(boolean) --*

          Specifies whether to use Lake Formation credentials for the
          crawler instead of the IAM role credentials.

        * **AccountId** *(string) --*

          Required for cross-account crawls. For crawls in the same
          account as the target data, this can be left as null.

      * **Configuration** (*string*) -- Crawler configuration
        information. This versioned JSON string allows users to
        specify aspects of a crawler's behavior. For more information,
        see Setting crawler configuration options.

      * **CrawlerSecurityConfiguration** (*string*) -- The name of the
        "SecurityConfiguration" structure to be used by this crawler.

      * **Tags** (*dict*) --

        The tags to use with this crawler request. You may use tags to
        limit access to the crawler. For more information about tags
        in Glue, see Amazon Web Services Tags in Glue in the developer
        guide.

        * *(string) --*

          * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"
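
   A minimal sketch of how the parameters above might be assembled,
   assuming they are passed to "create_crawler" together with its
   "Name", "Role", and "Targets" arguments. The helper name
   "build_crawler_request", the traversal depth of 5, and the chosen
   policy values are illustrative assumptions, not recommendations.

```python
def build_crawler_request(name, role_arn, hudi_paths, daily_at_utc=(12, 15)):
    """Assemble keyword arguments for a crawler-creation call.

    Hypothetical helper: nothing here contacts AWS, it only builds a dict.
    """
    hour, minute = daily_at_utc
    return {
        "Name": name,
        "Role": role_arn,
        "Targets": {
            "HudiTargets": [
                {
                    # Root folders that the crawler scans for Hudi tables.
                    "Paths": list(hudi_paths),
                    # Assumed value; limits how deep the crawler descends.
                    "MaximumTraversalDepth": 5,
                }
            ]
        },
        # Cron fields: minute, hour, day-of-month, month, day-of-week, year.
        "Schedule": f"cron({minute} {hour} * * ? *)",
        "SchemaChangePolicy": {
            "UpdateBehavior": "UPDATE_IN_DATABASE",
            "DeleteBehavior": "LOG",
        },
        "RecrawlPolicy": {"RecrawlBehavior": "CRAWL_EVERYTHING"},
        "LineageConfiguration": {"CrawlerLineageSettings": "DISABLE"},
    }
```

   With a real client, the result could be passed as
   "client.create_crawler(**build_crawler_request(...))"; the name, role
   ARN, and S3 paths are placeholders you would supply.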


list_crawlers
*************

Glue.Client.list_crawlers(**kwargs)

   Retrieves the names of all crawler resources in this Amazon Web
   Services account, or the resources with the specified tag. This
   operation allows you to see which resources are available in your
   account, and their names.

   This operation takes the optional "Tags" field, which you can use
   as a filter on the response so that tagged resources can be
   retrieved as a group. If you choose to use tags filtering, only
   resources with the tag are retrieved.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.list_crawlers(
          MaxResults=123,
          NextToken='string',
          Tags={
              'string': 'string'
          }
      )

   Parameters:
      * **MaxResults** (*integer*) -- The maximum size of a list to
        return.

      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation request.

      * **Tags** (*dict*) --

        Specifies to return only these tagged resources.

        * *(string) --*

          * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'CrawlerNames': [
                 'string',
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **CrawlerNames** *(list) --*

          The names of all crawlers in the account, or the crawlers
          with the specified tags.

          * *(string) --*

        * **NextToken** *(string) --*

          A continuation token, if the returned list does not contain
          the last crawler available.

   **Exceptions**

   * "Glue.Client.exceptions.OperationTimeoutException"
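
   The "NextToken" handshake described above can be sketched as a
   simple loop. This is a minimal illustration, assuming "client" is a
   Glue client created with "boto3.client('glue')"; the helper name
   "list_all_crawler_names" and the page size of 100 are assumptions.

```python
def list_all_crawler_names(client, tags=None, page_size=100):
    """Collect crawler names across all pages of list_crawlers."""
    kwargs = {"MaxResults": page_size}
    if tags:
        # Only crawlers carrying these tags are returned.
        kwargs["Tags"] = tags

    names = []
    while True:
        response = client.list_crawlers(**kwargs)
        names.extend(response.get("CrawlerNames", []))
        token = response.get("NextToken")
        if not token:
            # No continuation token means this was the last page.
            return names
        kwargs["NextToken"] = token
```

   Passing the client in as an argument keeps the helper easy to test
   with a stand-in object that returns canned pages.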


batch_get_dev_endpoints
***********************

Glue.Client.batch_get_dev_endpoints(**kwargs)

   Returns a list of resource metadata for a given list of development
   endpoint names. After calling the "ListDevEndpoints" operation, you
   can call this operation to access the data to which you have been
   granted permissions. This operation supports all IAM permissions,
   including permission conditions that use tags.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.batch_get_dev_endpoints(
          DevEndpointNames=[
              'string',
          ]
      )

   Parameters:
      **DevEndpointNames** (*list*) --

      **[REQUIRED]**

      The list of "DevEndpoint" names, which might be the names
      returned from the "ListDevEndpoints" operation.

      * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'DevEndpoints': [
                 {
                     'EndpointName': 'string',
                     'RoleArn': 'string',
                     'SecurityGroupIds': [
                         'string',
                     ],
                     'SubnetId': 'string',
                     'YarnEndpointAddress': 'string',
                     'PrivateAddress': 'string',
                     'ZeppelinRemoteSparkInterpreterPort': 123,
                     'PublicAddress': 'string',
                     'Status': 'string',
                     'WorkerType': 'Standard'|'G.1X'|'G.2X'|'G.025X'|'G.4X'|'G.8X'|'Z.2X',
                     'GlueVersion': 'string',
                     'NumberOfWorkers': 123,
                     'NumberOfNodes': 123,
                     'AvailabilityZone': 'string',
                     'VpcId': 'string',
                     'ExtraPythonLibsS3Path': 'string',
                     'ExtraJarsS3Path': 'string',
                     'FailureReason': 'string',
                     'LastUpdateStatus': 'string',
                     'CreatedTimestamp': datetime(2015, 1, 1),
                     'LastModifiedTimestamp': datetime(2015, 1, 1),
                     'PublicKey': 'string',
                     'PublicKeys': [
                         'string',
                     ],
                     'SecurityConfiguration': 'string',
                     'Arguments': {
                         'string': 'string'
                     }
                 },
             ],
             'DevEndpointsNotFound': [
                 'string',
             ]
         }

      **Response Structure**

      * *(dict) --*

        * **DevEndpoints** *(list) --*

          A list of "DevEndpoint" definitions.

          * *(dict) --*

            A development endpoint where a developer can remotely
            debug extract, transform, and load (ETL) scripts.

            * **EndpointName** *(string) --*

              The name of the "DevEndpoint".

            * **RoleArn** *(string) --*

              The Amazon Resource Name (ARN) of the IAM role used in
              this "DevEndpoint".

            * **SecurityGroupIds** *(list) --*

              A list of security group identifiers used in this
              "DevEndpoint".

              * *(string) --*

            * **SubnetId** *(string) --*

              The subnet ID for this "DevEndpoint".

            * **YarnEndpointAddress** *(string) --*

              The YARN endpoint address used by this "DevEndpoint".

            * **PrivateAddress** *(string) --*

              A private IP address to access the "DevEndpoint" within
              a VPC if the "DevEndpoint" is created within one. The
              "PrivateAddress" field is present only when you create
              the "DevEndpoint" within your VPC.

            * **ZeppelinRemoteSparkInterpreterPort** *(integer) --*

              The Apache Zeppelin port for the remote Apache Spark
              interpreter.

            * **PublicAddress** *(string) --*

              The public IP address used by this "DevEndpoint". The
              "PublicAddress" field is present only when you create a
              non-virtual private cloud (VPC) "DevEndpoint".

            * **Status** *(string) --*

              The current status of this "DevEndpoint".

            * **WorkerType** *(string) --*

              The type of predefined worker that is allocated to the
              development endpoint. Accepts a value of Standard, G.1X,
              or G.2X.

              * For the "Standard" worker type, each worker provides 4
                vCPU, 16 GB of memory, and a 50 GB disk, and 2
                executors per worker.

              * For the "G.1X" worker type, each worker maps to 1 DPU
                (4 vCPU, 16 GB of memory, 64 GB disk), and provides 1
                executor per worker. We recommend this worker type for
                memory-intensive jobs.

              * For the "G.2X" worker type, each worker maps to 2 DPU
                (8 vCPU, 32 GB of memory, 128 GB disk), and provides 1
                executor per worker. We recommend this worker type for
                memory-intensive jobs.

              Known issue: when a development endpoint is created with
              the "G.2X" "WorkerType" configuration, the Spark drivers
              for the development endpoint will run on 4 vCPU, 16 GB
              of memory, and a 64 GB disk.

            * **GlueVersion** *(string) --*

              Glue version determines the versions of Apache Spark and
              Python that Glue supports. The Python version indicates
              the version supported for running your ETL scripts on
              development endpoints.

              For more information about the available Glue versions
              and corresponding Spark and Python versions, see Glue
              version in the developer guide.

              Development endpoints that are created without
              specifying a Glue version default to Glue 0.9.

              You can specify a version of Python support for
              development endpoints by using the "Arguments" parameter
              in the "CreateDevEndpoint" or "UpdateDevEndpoint" APIs.
              If no arguments are provided, the version defaults to
              Python 2.

            * **NumberOfWorkers** *(integer) --*

              The number of workers of a defined "WorkerType" that are
              allocated to the development endpoint.

              The maximum number of workers you can define is 299 for
              "G.1X", and 149 for "G.2X".

            * **NumberOfNodes** *(integer) --*

              The number of Glue Data Processing Units (DPUs)
              allocated to this "DevEndpoint".

            * **AvailabilityZone** *(string) --*

              The Amazon Web Services Availability Zone where this
              "DevEndpoint" is located.

            * **VpcId** *(string) --*

              The ID of the virtual private cloud (VPC) used by this
              "DevEndpoint".

            * **ExtraPythonLibsS3Path** *(string) --*

              The paths to one or more Python libraries in an Amazon
              S3 bucket that should be loaded in your "DevEndpoint".
              Multiple values must be complete paths separated by a
              comma.

              Note:

                You can only use pure Python libraries with a
                "DevEndpoint". Libraries that rely on C extensions,
                such as the pandas Python data analysis library, are
                not currently supported.

            * **ExtraJarsS3Path** *(string) --*

              The path to one or more Java ".jar" files in an S3
              bucket that should be loaded in your "DevEndpoint".

              Note:

                You can only use pure Java/Scala libraries with a
                "DevEndpoint".

            * **FailureReason** *(string) --*

              The reason for a current failure in this "DevEndpoint".

            * **LastUpdateStatus** *(string) --*

              The status of the last update.

            * **CreatedTimestamp** *(datetime) --*

              The point in time at which this "DevEndpoint" was
              created.

            * **LastModifiedTimestamp** *(datetime) --*

              The point in time at which this "DevEndpoint" was last
              modified.

            * **PublicKey** *(string) --*

              The public key to be used by this "DevEndpoint" for
              authentication. This attribute is provided for backward
              compatibility because the recommended attribute to use
              is public keys.

            * **PublicKeys** *(list) --*

              A list of public keys to be used by the "DevEndpoints"
              for authentication. Using this attribute is preferred
              over a single public key because the public keys allow
              you to have a different private key per client.

              Note:

                If you previously created an endpoint with a public
                key, you must remove that key to be able to set a list
                of public keys. Call the "UpdateDevEndpoint" API
                operation with the public key content in the
                "deletePublicKeys" attribute, and the list of new keys
                in the "addPublicKeys" attribute.

              * *(string) --*

            * **SecurityConfiguration** *(string) --*

              The name of the "SecurityConfiguration" structure to be
              used with this "DevEndpoint".

            * **Arguments** *(dict) --*

              A map of arguments used to configure the "DevEndpoint".

              Valid arguments are:

              * "--enable-glue-datacatalog": ""

              You can specify a version of Python support for
              development endpoints by using the "Arguments" parameter
              in the "CreateDevEndpoint" or "UpdateDevEndpoint" APIs.
              If no arguments are provided, the version defaults to
              Python 2.

              * *(string) --*

                * *(string) --*

        * **DevEndpointsNotFound** *(list) --*

          A list of "DevEndpoints" not found.

          * *(string) --*

   **Exceptions**

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"
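
   A short sketch of how the response might be consumed, separating
   the endpoints that were found from the names reported in
   "DevEndpointsNotFound". The helper name "get_dev_endpoint_statuses"
   is an assumption, and "client" is expected to be a Glue client.

```python
def get_dev_endpoint_statuses(client, endpoint_names):
    """Return ({endpoint name: status}, [names that were not found])."""
    response = client.batch_get_dev_endpoints(
        DevEndpointNames=list(endpoint_names)
    )
    statuses = {
        endpoint["EndpointName"]: endpoint.get("Status")
        for endpoint in response.get("DevEndpoints", [])
    }
    missing = list(response.get("DevEndpointsNotFound", []))
    return statuses, missing
```

   Checking the second return value makes it explicit when a requested
   name did not resolve to an endpoint.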


batch_get_triggers
******************

Glue.Client.batch_get_triggers(**kwargs)

   Returns a list of resource metadata for a given list of trigger
   names. After calling the "ListTriggers" operation, you can call
   this operation to access the data to which you have been granted
   permissions. This operation supports all IAM permissions, including
   permission conditions that use tags.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.batch_get_triggers(
          TriggerNames=[
              'string',
          ]
      )

   Parameters:
      **TriggerNames** (*list*) --

      **[REQUIRED]**

      A list of trigger names, which may be the names returned from
      the "ListTriggers" operation.

      * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Triggers': [
                 {
                     'Name': 'string',
                     'WorkflowName': 'string',
                     'Id': 'string',
                     'Type': 'SCHEDULED'|'CONDITIONAL'|'ON_DEMAND'|'EVENT',
                     'State': 'CREATING'|'CREATED'|'ACTIVATING'|'ACTIVATED'|'DEACTIVATING'|'DEACTIVATED'|'DELETING'|'UPDATING',
                     'Description': 'string',
                     'Schedule': 'string',
                     'Actions': [
                         {
                             'JobName': 'string',
                             'Arguments': {
                                 'string': 'string'
                             },
                             'Timeout': 123,
                             'SecurityConfiguration': 'string',
                             'NotificationProperty': {
                                 'NotifyDelayAfter': 123
                             },
                             'CrawlerName': 'string'
                         },
                     ],
                     'Predicate': {
                         'Logical': 'AND'|'ANY',
                         'Conditions': [
                             {
                                 'LogicalOperator': 'EQUALS',
                                 'JobName': 'string',
                                 'State': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT'|'ERROR'|'WAITING'|'EXPIRED',
                                 'CrawlerName': 'string',
                                 'CrawlState': 'RUNNING'|'CANCELLING'|'CANCELLED'|'SUCCEEDED'|'FAILED'|'ERROR'
                             },
                         ]
                     },
                     'EventBatchingCondition': {
                         'BatchSize': 123,
                         'BatchWindow': 123
                     }
                 },
             ],
             'TriggersNotFound': [
                 'string',
             ]
         }

      **Response Structure**

      * *(dict) --*

        * **Triggers** *(list) --*

          A list of trigger definitions.

          * *(dict) --*

            Information about a specific trigger.

            * **Name** *(string) --*

              The name of the trigger.

            * **WorkflowName** *(string) --*

              The name of the workflow associated with the trigger.

            * **Id** *(string) --*

              Reserved for future use.

            * **Type** *(string) --*

              The type of trigger that this is.

            * **State** *(string) --*

              The current state of the trigger.

            * **Description** *(string) --*

              A description of this trigger.

            * **Schedule** *(string) --*

              A "cron" expression used to specify the schedule (see
              Time-Based Schedules for Jobs and Crawlers). For
              example, to run something every day at 12:15 UTC, you
              would specify: "cron(15 12 * * ? *)".

            * **Actions** *(list) --*

              The actions initiated by this trigger.

              * *(dict) --*

                Defines an action to be initiated by a trigger.

                * **JobName** *(string) --*

                  The name of a job to be run.

                * **Arguments** *(dict) --*

                  The job arguments used when this trigger fires. For
                  this job run, they replace the default arguments set
                  in the job definition itself.

                  You can specify arguments here that your own job-
                  execution script consumes, as well as arguments that
                  Glue itself consumes.

                  For information about how to specify and consume
                  your own Job arguments, see the Calling Glue APIs in
                  Python topic in the developer guide.

                  For information about the key-value pairs that Glue
                  consumes to set up your job, see the Special
                  Parameters Used by Glue topic in the developer
                  guide.

                  * *(string) --*

                    * *(string) --*

                * **Timeout** *(integer) --*

                  The "JobRun" timeout in minutes. This is the maximum
                  time that a job run can consume resources before it
                  is terminated and enters "TIMEOUT" status. This
                  overrides the timeout value set in the parent job.

                  Jobs must have timeout values less than 7 days or
                  10080 minutes. Otherwise, the jobs will throw an
                  exception.

                  When the value is left blank, the timeout defaults
                  to 2880 minutes.

                  Any existing Glue jobs that had a timeout value
                  greater than 7 days will default to 7 days. For
                  instance, if you have specified a timeout of 20 days
                  for a batch job, it will be stopped on the 7th day.

                  For streaming jobs, if you have set up a maintenance
                  window, it will be restarted during the maintenance
                  window after 7 days.

                * **SecurityConfiguration** *(string) --*

                  The name of the "SecurityConfiguration" structure to
                  be used with this action.

                * **NotificationProperty** *(dict) --*

                  Specifies configuration properties of a job run
                  notification.

                  * **NotifyDelayAfter** *(integer) --*

                    After a job run starts, the number of minutes to
                    wait before sending a job run delay notification.

                * **CrawlerName** *(string) --*

                  The name of the crawler to be used with this action.

            * **Predicate** *(dict) --*

              The predicate of this trigger, which defines when it
              will fire.

              * **Logical** *(string) --*

                An optional field if only one condition is listed. If
                multiple conditions are listed, then this field is
                required.

              * **Conditions** *(list) --*

                A list of the conditions that determine when the
                trigger will fire.

                * *(dict) --*

                  Defines a condition under which a trigger fires.

                  * **LogicalOperator** *(string) --*

                    A logical operator.

                  * **JobName** *(string) --*

                    The name of the job whose "JobRuns" this condition
                    applies to, and on which this trigger waits.

                  * **State** *(string) --*

                    The condition state. Currently, the only job
                    states that a trigger can listen for are
                    "SUCCEEDED", "STOPPED", "FAILED", and "TIMEOUT".
                    The only crawler states that a trigger can listen
                    for are "SUCCEEDED", "FAILED", and "CANCELLED".

                  * **CrawlerName** *(string) --*

                    The name of the crawler to which this condition
                    applies.

                  * **CrawlState** *(string) --*

                    The state of the crawler to which this condition
                    applies.

            * **EventBatchingCondition** *(dict) --*

              Batch condition that must be met (specified number of
              events received or batch time window expired) before
              the EventBridge event trigger fires.

              * **BatchSize** *(integer) --*

                Number of events that must be received from Amazon
                EventBridge before the EventBridge event trigger
                fires.

              * **BatchWindow** *(integer) --*

                Window of time in seconds after which the EventBridge
                event trigger fires. The window starts when the first
                event is received.

        * **TriggersNotFound** *(list) --*

          A list of names of triggers not found.

          * *(string) --*

   **Exceptions**

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"
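
   As an illustration of reading the response structure above, the
   following sketch picks out the "SCHEDULED" triggers and their cron
   expressions. The helper name "scheduled_trigger_crons" is an
   assumption, and "client" is expected to be a Glue client.

```python
def scheduled_trigger_crons(client, trigger_names):
    """Return ({trigger name: cron expression}, [names not found])."""
    response = client.batch_get_triggers(TriggerNames=list(trigger_names))
    schedules = {
        trigger["Name"]: trigger.get("Schedule")
        for trigger in response.get("Triggers", [])
        # Only scheduled triggers are kept; other types are skipped.
        if trigger.get("Type") == "SCHEDULED"
    }
    not_found = list(response.get("TriggersNotFound", []))
    return schedules, not_found
```

   Triggers of other types, such as "CONDITIONAL" or "EVENT", could be
   inspected the same way through their "Predicate" or
   "EventBatchingCondition" fields.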


create_data_quality_ruleset
***************************

Glue.Client.create_data_quality_ruleset(**kwargs)

   Creates a data quality ruleset with DQDL rules applied to a
   specified Glue table.

   You create the ruleset using the Data Quality Definition Language
   (DQDL). For more information, see the Glue developer guide.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_data_quality_ruleset(
          Name='string',
          Description='string',
          Ruleset='string',
          Tags={
              'string': 'string'
          },
          TargetTable={
              'TableName': 'string',
              'DatabaseName': 'string',
              'CatalogId': 'string'
          },
          DataQualitySecurityConfiguration='string',
          ClientToken='string'
      )

   Parameters:
      * **Name** (*string*) --

        **[REQUIRED]**

        A unique name for the data quality ruleset.

      * **Description** (*string*) -- A description of the data
        quality ruleset.

      * **Ruleset** (*string*) --

        **[REQUIRED]**

        A Data Quality Definition Language (DQDL) ruleset. For more
        information, see the Glue developer guide.

      * **Tags** (*dict*) --

        A list of tags applied to the data quality ruleset.

        * *(string) --*

          * *(string) --*

      * **TargetTable** (*dict*) --

        A target table associated with the data quality ruleset.

        * **TableName** *(string) --* **[REQUIRED]**

          The name of the Glue table.

        * **DatabaseName** *(string) --* **[REQUIRED]**

          The name of the database where the Glue table exists.

        * **CatalogId** *(string) --*

          The catalog ID where the Glue table exists.

      * **DataQualitySecurityConfiguration** (*string*) -- The name of
        the security configuration created with the data quality
        encryption option.

      * **ClientToken** (*string*) -- Used for idempotency and is
        recommended to be set to a random ID (such as a UUID) to avoid
        creating or starting multiple instances of the same resource.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Name': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Name** *(string) --*

          A unique name for the data quality ruleset.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"
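
   A minimal sketch of a call that follows the "ClientToken"
   recommendation above by generating a UUID for each request. The
   helper name "create_ruleset" is an assumption, and the DQDL string
   in the usage note is only an illustrative example of a ruleset.

```python
import uuid


def create_ruleset(client, name, dqdl_rules, database_name, table_name):
    """Create a data quality ruleset for one table and return its name."""
    response = client.create_data_quality_ruleset(
        Name=name,
        Ruleset=dqdl_rules,
        TargetTable={
            "TableName": table_name,
            "DatabaseName": database_name,
        },
        # A random ID used for idempotency, as the parameter notes suggest.
        ClientToken=str(uuid.uuid4()),
    )
    return response["Name"]
```

   For example, "create_ruleset(client, 'orders-checks', 'Rules = [
   IsComplete "order_id" ]', 'sales', 'orders')", where the ruleset,
   database, and table names are placeholders.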


update_database
***************

Glue.Client.update_database(**kwargs)

   Updates an existing database definition in a Data Catalog.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.update_database(
          CatalogId='string',
          Name='string',
          DatabaseInput={
              'Name': 'string',
              'Description': 'string',
              'LocationUri': 'string',
              'Parameters': {
                  'string': 'string'
              },
              'CreateTableDefaultPermissions': [
                  {
                      'Principal': {
                          'DataLakePrincipalIdentifier': 'string'
                      },
                      'Permissions': [
                          'ALL'|'SELECT'|'ALTER'|'DROP'|'DELETE'|'INSERT'|'CREATE_DATABASE'|'CREATE_TABLE'|'DATA_LOCATION_ACCESS',
                      ]
                  },
              ],
              'TargetDatabase': {
                  'CatalogId': 'string',
                  'DatabaseName': 'string',
                  'Region': 'string'
              },
              'FederatedDatabase': {
                  'Identifier': 'string',
                  'ConnectionName': 'string',
                  'ConnectionType': 'string'
              }
          }
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog in
        which the metadata database resides. If none is provided, the
        Amazon Web Services account ID is used by default.

      * **Name** (*string*) --

        **[REQUIRED]**

        The name of the database to update in the catalog. For Hive
        compatibility, this is folded to lowercase.

      * **DatabaseInput** (*dict*) --

        **[REQUIRED]**

        A "DatabaseInput" object specifying the new definition of the
        metadata database in the catalog.

        * **Name** *(string) --* **[REQUIRED]**

          The name of the database. For Hive compatibility, this is
          folded to lowercase when it is stored.

        * **Description** *(string) --*

          A description of the database.

        * **LocationUri** *(string) --*

          The location of the database (for example, an HDFS path).

        * **Parameters** *(dict) --*

          These key-value pairs define parameters and properties of
          the database.

          * *(string) --*

            * *(string) --*

        * **CreateTableDefaultPermissions** *(list) --*

          Creates a set of default permissions on the table for
          principals. Used by Lake Formation. Not used in the normal
          course of Glue operations.

          * *(dict) --*

            Permissions granted to a principal.

            * **Principal** *(dict) --*

              The principal who is granted permissions.

              * **DataLakePrincipalIdentifier** *(string) --*

                An identifier for the Lake Formation principal.

            * **Permissions** *(list) --*

              The permissions that are granted to the principal.

              * *(string) --*

        * **TargetDatabase** *(dict) --*

          A "DatabaseIdentifier" structure that describes a target
          database for resource linking.

          * **CatalogId** *(string) --*

            The ID of the Data Catalog in which the database resides.

          * **DatabaseName** *(string) --*

            The name of the catalog database.

          * **Region** *(string) --*

            Region of the target database.

        * **FederatedDatabase** *(dict) --*

          A "FederatedDatabase" structure that references an entity
          outside the Glue Data Catalog.

          * **Identifier** *(string) --*

            A unique identifier for the federated database.

          * **ConnectionName** *(string) --*

            The name of the connection to the external metastore.

          * **ConnectionType** *(string) --*

            The type of connection used to access the federated
            database, such as JDBC, ODBC, or other supported
            connection protocols.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"

   * "Glue.Client.exceptions.ConcurrentModificationException"

   * "Glue.Client.exceptions.FederationSourceException"

   * "Glue.Client.exceptions.FederationSourceRetryableException"

   * "Glue.Client.exceptions.AlreadyExistsException"


get_entity_records
******************

Glue.Client.get_entity_records(**kwargs)

   This API is used to query preview data from a given connection type
   or from a native Amazon S3 based Glue Data Catalog.

   Returns records as an array of JSON blobs. Each record is formatted
   using Jackson JsonNode based on the field type defined by the
   "DescribeEntity" API.

   Spark connectors generate schemas according to the same data type
   mapping as in the "DescribeEntity" API. Spark connectors convert
   data to the appropriate data types matching the schema when
   returning rows.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_entity_records(
          ConnectionName='string',
          CatalogId='string',
          EntityName='string',
          NextToken='string',
          DataStoreApiVersion='string',
          ConnectionOptions={
              'string': 'string'
          },
          FilterPredicate='string',
          Limit=123,
          OrderBy='string',
          SelectedFields=[
              'string',
          ]
      )

   Parameters:
      * **ConnectionName** (*string*) -- The name of the connection
        that contains the connection type credentials.

      * **CatalogId** (*string*) -- The catalog ID of the catalog that
        contains the connection. This can be null. By default, the
        Amazon Web Services Account ID is the catalog ID.

      * **EntityName** (*string*) --

        **[REQUIRED]**

        The name of the entity whose preview data you want to query
        from the given connection type.

      * **NextToken** (*string*) -- A continuation token, included if
        this is a continuation call.

      * **DataStoreApiVersion** (*string*) -- The API version of the
        SaaS connector.

      * **ConnectionOptions** (*dict*) --

        Connector options that are required to query the data.

        * *(string) --*

          * *(string) --*

      * **FilterPredicate** (*string*) -- A filter predicate that you
        can apply in the query request.

      * **Limit** (*integer*) --

        **[REQUIRED]**

        Limits the number of records fetched with the request.

      * **OrderBy** (*string*) -- A parameter that orders the response
        preview data.

      * **SelectedFields** (*list*) --

        List of fields that we want to fetch as part of preview data.

        * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Records': [
                 {...}|[...]|123|123.4|'string'|True|None,
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Records** *(list) --*

          A list of the requested objects.

          * (*document*) --

        * **NextToken** *(string) --*

          A continuation token, present if the current segment is not
          the last.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.GlueEncryptionException"

   * "Glue.Client.exceptions.ValidationException"

   * "Glue.Client.exceptions.FederationSourceException"

   * "Glue.Client.exceptions.AccessDeniedException"
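
   The "NextToken" handling described above can be sketched as a small
   generator. This is a minimal illustration, not an official helper:
   the function name "iter_entity_records" and its "page_size"
   argument are hypothetical, and "client" is assumed to be a Glue
   client such as the one returned by "boto3.client('glue')".

```python
def iter_entity_records(client, entity_name, page_size=100, **kwargs):
    """Yield preview records for an entity, following NextToken until exhausted.

    Extra keyword arguments (for example ConnectionName or SelectedFields)
    are passed through to get_entity_records unchanged.
    """
    params = {"EntityName": entity_name, "Limit": page_size, **kwargs}
    while True:
        response = client.get_entity_records(**params)
        # Each record is a JSON-like document (dict, list, number, string, ...).
        yield from response.get("Records", [])
        token = response.get("NextToken")
        if not token:
            break  # no continuation token: this was the last segment
        params["NextToken"] = token
```

   A caller might iterate with "for record in
   iter_entity_records(client, 'Account', ConnectionName='my-
   connection')", where both names are placeholders.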


update_source_control_from_job
******************************

Glue.Client.update_source_control_from_job(**kwargs)

   Synchronizes a job to the source control repository. This operation
   takes the job artifacts from the Glue internal stores and makes a
   commit to the remote repository that is configured on the job.

   This API supports optional parameters which take in the repository
   information.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.update_source_control_from_job(
          JobName='string',
          Provider='GITHUB'|'GITLAB'|'BITBUCKET'|'AWS_CODE_COMMIT',
          RepositoryName='string',
          RepositoryOwner='string',
          BranchName='string',
          Folder='string',
          CommitId='string',
          AuthStrategy='PERSONAL_ACCESS_TOKEN'|'AWS_SECRETS_MANAGER',
          AuthToken='string'
      )

   Parameters:
      * **JobName** (*string*) -- The name of the Glue job to be
        synchronized to or from the remote repository.

      * **Provider** (*string*) -- The provider for the remote
        repository. Possible values: GITHUB, AWS_CODE_COMMIT, GITLAB,
        BITBUCKET.

      * **RepositoryName** (*string*) -- The name of the remote
        repository that contains the job artifacts. For BitBucket
        providers, "RepositoryName" should include "WorkspaceName".
        Use the format "<WorkspaceName>/<RepositoryName>".

      * **RepositoryOwner** (*string*) -- The owner of the remote
        repository that contains the job artifacts.

      * **BranchName** (*string*) -- An optional branch in the remote
        repository.

      * **Folder** (*string*) -- An optional folder in the remote
        repository.

      * **CommitId** (*string*) -- A commit ID for a commit in the
        remote repository.

      * **AuthStrategy** (*string*) -- The type of authentication,
        which can be an authentication token stored in Amazon Web
        Services Secrets Manager, or a personal access token.

      * **AuthToken** (*string*) -- The value of the authorization
        token.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'JobName': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **JobName** *(string) --*

          The name of the Glue job.

   **Exceptions**

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.ValidationException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
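
   Because every parameter is optional and Bitbucket repositories use
   the "<WorkspaceName>/<RepositoryName>" format, it can help to build
   the request in one place. The sketch below is illustrative only:
   "build_source_control_request" and its "workspace_name" argument
   are hypothetical names, not part of the Glue API.

```python
def build_source_control_request(job_name, provider, repository_name,
                                 repository_owner, workspace_name=None,
                                 **optional):
    """Build keyword arguments for update_source_control_from_job.

    For BITBUCKET, the workspace name is prefixed to the repository name.
    Optional values that are None are left out of the request.
    """
    if provider == "BITBUCKET" and workspace_name:
        repository_name = f"{workspace_name}/{repository_name}"
    request = {
        "JobName": job_name,
        "Provider": provider,
        "RepositoryName": repository_name,
        "RepositoryOwner": repository_owner,
    }
    # e.g. BranchName, Folder, CommitId, AuthStrategy, AuthToken
    request.update({k: v for k, v in optional.items() if v is not None})
    return request


def push_job_to_repository(client, **request_args):
    """Call the API and return the job name echoed in the response."""
    request = build_source_control_request(**request_args)
    return client.update_source_control_from_job(**request)["JobName"]
```

   The resulting dictionary is passed as keyword arguments, so only
   the fields that were actually supplied are sent.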


stop_column_statistics_task_run_schedule
****************************************

Glue.Client.stop_column_statistics_task_run_schedule(**kwargs)

   Stops a column statistics task run schedule.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.stop_column_statistics_task_run_schedule(
          DatabaseName='string',
          TableName='string'
      )

   Parameters:
      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the database where the table resides.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table for which to stop a column statistics
        task run schedule.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"
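
   When stopping schedules for several tables, one possible pattern is
   to collect the tables for which the service reports
   "EntityNotFoundException" instead of aborting on the first one. The
   helper name "stop_schedules" below is hypothetical, and "client" is
   assumed to be a Glue client.

```python
def stop_schedules(client, database_name, table_names):
    """Stop the column statistics schedule for each table.

    Returns the table names for which the call raised
    EntityNotFoundException.
    """
    not_found = []
    for table_name in table_names:
        try:
            client.stop_column_statistics_task_run_schedule(
                DatabaseName=database_name,
                TableName=table_name,
            )
        except client.exceptions.EntityNotFoundException:
            not_found.append(table_name)
    return not_found
```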


list_entities
*************

Glue.Client.list_entities(**kwargs)

   Returns the available entities supported by the connection type.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.list_entities(
          ConnectionName='string',
          CatalogId='string',
          ParentEntityName='string',
          NextToken='string',
          DataStoreApiVersion='string'
      )

   Parameters:
      * **ConnectionName** (*string*) -- A name for the connection
        that has required credentials to query any connection type.

      * **CatalogId** (*string*) -- The catalog ID of the catalog that
        contains the connection. This can be null. By default, the
        Amazon Web Services Account ID is the catalog ID.

      * **ParentEntityName** (*string*) -- Name of the parent entity
        for which you want to list the children. This parameter takes
        a fully-qualified path of the entity in order to list the
        child entities.

      * **NextToken** (*string*) -- A continuation token, included if
        this is a continuation call.

      * **DataStoreApiVersion** (*string*) -- The API version of the
        SaaS connector.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Entities': [
                 {
                     'EntityName': 'string',
                     'Label': 'string',
                     'IsParentEntity': True|False,
                     'Description': 'string',
                     'Category': 'string',
                     'CustomProperties': {
                         'string': 'string'
                     }
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Entities** *(list) --*

          A list of "Entity" objects.

          * *(dict) --*

            An entity supported by a given "ConnectionType".

            * **EntityName** *(string) --*

              The name of the entity.

            * **Label** *(string) --*

              Label used for the entity.

            * **IsParentEntity** *(boolean) --*

              A Boolean value that indicates whether the entity has
              sub-objects that can be listed.

            * **Description** *(string) --*

              A description of the entity.

            * **Category** *(string) --*

              The type of entities that are present in the response.
              This value depends on the source connection. For
              example, this is "SObjects" for Salesforce and
              "databases", "schemas", or "tables" for sources like
              Amazon Redshift.

            * **CustomProperties** *(dict) --*

              An optional map of keys which may be returned for an
              entity by a connector.

              * *(string) --*

                * *(string) --*

        * **NextToken** *(string) --*

          A continuation token, present if the current segment is not
          the last.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.GlueEncryptionException"

   * "Glue.Client.exceptions.ValidationException"

   * "Glue.Client.exceptions.FederationSourceException"

   * "Glue.Client.exceptions.AccessDeniedException"
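
   As an illustration of the response shape above, the following
   sketch pages through "list_entities" and separates entities that
   have listable children from those that do not, using
   "IsParentEntity". The function name "split_entities" is
   hypothetical; "client" is assumed to be a Glue client.

```python
def split_entities(client, connection_name, parent_entity_name=None):
    """Return (parent_names, leaf_names) for one level of a connection."""
    params = {"ConnectionName": connection_name}
    if parent_entity_name is not None:
        params["ParentEntityName"] = parent_entity_name

    parents, leaves = [], []
    while True:
        response = client.list_entities(**params)
        for entity in response.get("Entities", []):
            target = parents if entity.get("IsParentEntity") else leaves
            target.append(entity["EntityName"])
        token = response.get("NextToken")
        if not token:
            break  # last segment reached
        params["NextToken"] = token
    return parents, leaves
```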


start_crawler
*************

Glue.Client.start_crawler(**kwargs)

   Starts a crawl using the specified crawler, regardless of what is
   scheduled. If the crawler is already running, the call fails with
   a "CrawlerRunningException".

   See also: AWS API Documentation

   **Request Syntax**

      response = client.start_crawler(
          Name='string'
      )

   Parameters:
      **Name** (*string*) --

      **[REQUIRED]**

      Name of the crawler to start.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.CrawlerRunningException"

   * "Glue.Client.exceptions.OperationTimeoutException"
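
   Since an already running crawler is reported as an exception, a
   caller that only wants the crawler to be running can treat that
   case as a non-error. A minimal sketch, assuming "client" is a Glue
   client; the helper name "start_crawler_if_idle" is hypothetical.

```python
def start_crawler_if_idle(client, name):
    """Start the crawler; return False if it was already running."""
    try:
        client.start_crawler(Name=name)
    except client.exceptions.CrawlerRunningException:
        # The crawler is already running, so there is nothing to start.
        return False
    return True
```

   Other errors, such as "EntityNotFoundException" for an unknown
   crawler name, still propagate to the caller.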


start_ml_evaluation_task_run
****************************

Glue.Client.start_ml_evaluation_task_run(**kwargs)

   Starts a task to estimate the quality of the transform.

   When you provide label sets as examples of truth, Glue machine
   learning uses some of those examples to learn from them. The rest
   of the labels are used as a test to estimate quality.

   Returns a unique identifier for the run. You can call
   "GetMLTaskRun" to get more information about the stats of the
   "EvaluationTaskRun".

   See also: AWS API Documentation

   **Request Syntax**

      response = client.start_ml_evaluation_task_run(
          TransformId='string'
      )

   Parameters:
      **TransformId** (*string*) --

      **[REQUIRED]**

      The unique identifier of the machine learning transform.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'TaskRunId': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **TaskRunId** *(string) --*

          The unique identifier associated with this run.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.ConcurrentRunsExceededException"

   * "Glue.Client.exceptions.MLTransformNotReadyException"
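
   The start-then-inspect flow described above can be sketched as
   follows. This is an illustration under assumptions: it relies on
   "get_ml_task_run" accepting "TransformId" and "TaskRunId" and
   returning a "Status" field, which is documented in that method's
   own section rather than here. The helper name, the set of terminal
   statuses, and the polling defaults are choices made for this
   example.

```python
import time

# Statuses treated as final in this sketch (an assumption of the example).
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT"}


def run_evaluation(client, transform_id, poll_seconds=30, max_polls=120,
                   sleep=time.sleep):
    """Start an evaluation task run and poll until it reaches a final status.

    Returns (task_run_id, status); status is None if polling gave up first.
    """
    started = client.start_ml_evaluation_task_run(TransformId=transform_id)
    task_run_id = started["TaskRunId"]
    for _ in range(max_polls):
        run = client.get_ml_task_run(
            TransformId=transform_id, TaskRunId=task_run_id
        )
        status = run.get("Status")
        if status in TERMINAL_STATUSES:
            return task_run_id, status
        sleep(poll_seconds)
    return task_run_id, None
```

   The "sleep" argument is injectable so the loop can be exercised
   without real delays.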


get_schema
**********

Glue.Client.get_schema(**kwargs)

   Describes the specified schema in detail.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_schema(
          SchemaId={
              'SchemaArn': 'string',
              'SchemaName': 'string',
              'RegistryName': 'string'
          }
      )

   Parameters:
      **SchemaId** (*dict*) --

      **[REQUIRED]**

      This is a wrapper structure to contain schema identity fields.
      The structure contains:

      * SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the
        schema. Either "SchemaArn", or both "SchemaName" and
        "RegistryName", must be provided.

      * SchemaId$SchemaName: The name of the schema. Either
        "SchemaArn", or both "SchemaName" and "RegistryName", must be
        provided.

      * **SchemaArn** *(string) --*

        The Amazon Resource Name (ARN) of the schema. One of
        "SchemaArn" or "SchemaName" has to be provided.

      * **SchemaName** *(string) --*

        The name of the schema. One of "SchemaArn" or "SchemaName" has
        to be provided.

      * **RegistryName** *(string) --*

        The name of the schema registry that contains the schema.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'RegistryName': 'string',
             'RegistryArn': 'string',
             'SchemaName': 'string',
             'SchemaArn': 'string',
             'Description': 'string',
             'DataFormat': 'AVRO'|'JSON'|'PROTOBUF',
             'Compatibility': 'NONE'|'DISABLED'|'BACKWARD'|'BACKWARD_ALL'|'FORWARD'|'FORWARD_ALL'|'FULL'|'FULL_ALL',
             'SchemaCheckpoint': 123,
             'LatestSchemaVersion': 123,
             'NextSchemaVersion': 123,
             'SchemaStatus': 'AVAILABLE'|'PENDING'|'DELETING',
             'CreatedTime': 'string',
             'UpdatedTime': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **RegistryName** *(string) --*

          The name of the registry.

        * **RegistryArn** *(string) --*

          The Amazon Resource Name (ARN) of the registry.

        * **SchemaName** *(string) --*

          The name of the schema.

        * **SchemaArn** *(string) --*

          The Amazon Resource Name (ARN) of the schema.

        * **Description** *(string) --*

          A description of the schema, if one was specified when the
          schema was created.

        * **DataFormat** *(string) --*

          The data format of the schema definition. Currently "AVRO",
          "JSON" and "PROTOBUF" are supported.

        * **Compatibility** *(string) --*

          The compatibility mode of the schema.

        * **SchemaCheckpoint** *(integer) --*

          The version number of the checkpoint (the last time the
          compatibility mode was changed).

        * **LatestSchemaVersion** *(integer) --*

          The latest version of the schema associated with the
          returned schema definition.

        * **NextSchemaVersion** *(integer) --*

          The next version of the schema associated with the returned
          schema definition.

        * **SchemaStatus** *(string) --*

          The status of the schema.

        * **CreatedTime** *(string) --*

          The date and time the schema was created.

        * **UpdatedTime** *(string) --*

          The date and time the schema was updated.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"
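
   The rule for "SchemaId" (either an ARN, or a schema name together
   with a registry name) can be checked on the client side before the
   call is made. The sketch below is one way to do that; the helper
   name "make_schema_id" is hypothetical, and rejecting a mix of both
   forms is a choice of this example rather than documented service
   behavior.

```python
def make_schema_id(schema_arn=None, schema_name=None, registry_name=None):
    """Build a SchemaId dict from an ARN or from a name/registry pair."""
    if schema_arn and not (schema_name or registry_name):
        return {"SchemaArn": schema_arn}
    if schema_name and registry_name and not schema_arn:
        return {"SchemaName": schema_name, "RegistryName": registry_name}
    raise ValueError(
        "Provide either schema_arn, or both schema_name and registry_name"
    )


def get_schema_status(client, **identity):
    """Fetch the schema and return its (SchemaStatus, LatestSchemaVersion)."""
    response = client.get_schema(SchemaId=make_schema_id(**identity))
    return response["SchemaStatus"], response["LatestSchemaVersion"]
```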


batch_get_workflows
*******************

Glue.Client.batch_get_workflows(**kwargs)

   Returns a list of resource metadata for a given list of workflow
   names. After calling the "ListWorkflows" operation, you can call
   this operation to access the data to which you have been granted
   permissions. This operation supports all IAM permissions, including
   permission conditions that use tags.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.batch_get_workflows(
          Names=[
              'string',
          ],
          IncludeGraph=True|False
      )

   Parameters:
      * **Names** (*list*) --

        **[REQUIRED]**

        A list of workflow names, which may be the names returned from
        the "ListWorkflows" operation.

        * *(string) --*

      * **IncludeGraph** (*boolean*) -- Specifies whether to include a
        graph when returning the workflow resource metadata.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Workflows': [
                 {
                     'Name': 'string',
                     'Description': 'string',
                     'DefaultRunProperties': {
                         'string': 'string'
                     },
                     'CreatedOn': datetime(2015, 1, 1),
                     'LastModifiedOn': datetime(2015, 1, 1),
                     'LastRun': {
                         'Name': 'string',
                         'WorkflowRunId': 'string',
                         'PreviousRunId': 'string',
                         'WorkflowRunProperties': {
                             'string': 'string'
                         },
                         'StartedOn': datetime(2015, 1, 1),
                         'CompletedOn': datetime(2015, 1, 1),
                         'Status': 'RUNNING'|'COMPLETED'|'STOPPING'|'STOPPED'|'ERROR',
                         'ErrorMessage': 'string',
                         'Statistics': {
                             'TotalActions': 123,
                             'TimeoutActions': 123,
                             'FailedActions': 123,
                             'StoppedActions': 123,
                             'SucceededActions': 123,
                             'RunningActions': 123,
                             'ErroredActions': 123,
                             'WaitingActions': 123
                         },
                         'Graph': {
                             'Nodes': [
                                 {
                                     'Type': 'CRAWLER'|'JOB'|'TRIGGER',
                                     'Name': 'string',
                                     'UniqueId': 'string',
                                     'TriggerDetails': {
                                         'Trigger': {
                                             'Name': 'string',
                                             'WorkflowName': 'string',
                                             'Id': 'string',
                                             'Type': 'SCHEDULED'|'CONDITIONAL'|'ON_DEMAND'|'EVENT',
                                             'State': 'CREATING'|'CREATED'|'ACTIVATING'|'ACTIVATED'|'DEACTIVATING'|'DEACTIVATED'|'DELETING'|'UPDATING',
                                             'Description': 'string',
                                             'Schedule': 'string',
                                             'Actions': [
                                                 {
                                                     'JobName': 'string',
                                                     'Arguments': {
                                                         'string': 'string'
                                                     },
                                                     'Timeout': 123,
                                                     'SecurityConfiguration': 'string',
                                                     'NotificationProperty': {
                                                         'NotifyDelayAfter': 123
                                                     },
                                                     'CrawlerName': 'string'
                                                 },
                                             ],
                                             'Predicate': {
                                                 'Logical': 'AND'|'ANY',
                                                 'Conditions': [
                                                     {
                                                         'LogicalOperator': 'EQUALS',
                                                         'JobName': 'string',
                                                         'State': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT'|'ERROR'|'WAITING'|'EXPIRED',
                                                         'CrawlerName': 'string',
                                                         'CrawlState': 'RUNNING'|'CANCELLING'|'CANCELLED'|'SUCCEEDED'|'FAILED'|'ERROR'
                                                     },
                                                 ]
                                             },
                                             'EventBatchingCondition': {
                                                 'BatchSize': 123,
                                                 'BatchWindow': 123
                                             }
                                         }
                                     },
                                     'JobDetails': {
                                         'JobRuns': [
                                             {
                                                 'Id': 'string',
                                                 'Attempt': 123,
                                                 'PreviousRunId': 'string',
                                                 'TriggerName': 'string',
                                                 'JobName': 'string',
                                                 'JobMode': 'SCRIPT'|'VISUAL'|'NOTEBOOK',
                                                 'JobRunQueuingEnabled': True|False,
                                                 'StartedOn': datetime(2015, 1, 1),
                                                 'LastModifiedOn': datetime(2015, 1, 1),
                                                 'CompletedOn': datetime(2015, 1, 1),
                                                 'JobRunState': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT'|'ERROR'|'WAITING'|'EXPIRED',
                                                 'Arguments': {
                                                     'string': 'string'
                                                 },
                                                 'ErrorMessage': 'string',
                                                 'PredecessorRuns': [
                                                     {
                                                         'JobName': 'string',
                                                         'RunId': 'string'
                                                     },
                                                 ],
                                                 'AllocatedCapacity': 123,
                                                 'ExecutionTime': 123,
                                                 'Timeout': 123,
                                                 'MaxCapacity': 123.0,
                                                 'WorkerType': 'Standard'|'G.1X'|'G.2X'|'G.025X'|'G.4X'|'G.8X'|'Z.2X',
                                                 'NumberOfWorkers': 123,
                                                 'SecurityConfiguration': 'string',
                                                 'LogGroupName': 'string',
                                                 'NotificationProperty': {
                                                     'NotifyDelayAfter': 123
                                                 },
                                                 'GlueVersion': 'string',
                                                 'DPUSeconds': 123.0,
                                                 'ExecutionClass': 'FLEX'|'STANDARD',
                                                 'MaintenanceWindow': 'string',
                                                 'ProfileName': 'string',
                                                 'StateDetail': 'string',
                                                 'ExecutionRoleSessionPolicy': 'string'
                                             },
                                         ]
                                     },
                                     'CrawlerDetails': {
                                         'Crawls': [
                                             {
                                                 'State': 'RUNNING'|'CANCELLING'|'CANCELLED'|'SUCCEEDED'|'FAILED'|'ERROR',
                                                 'StartedOn': datetime(2015, 1, 1),
                                                 'CompletedOn': datetime(2015, 1, 1),
                                                 'ErrorMessage': 'string',
                                                 'LogGroup': 'string',
                                                 'LogStream': 'string'
                                             },
                                         ]
                                     }
                                 },
                             ],
                             'Edges': [
                                 {
                                     'SourceId': 'string',
                                     'DestinationId': 'string'
                                 },
                             ]
                         },
                         'StartingEventBatchCondition': {
                             'BatchSize': 123,
                             'BatchWindow': 123
                         }
                     },
                     'Graph': {
                         'Nodes': [
                             {
                                 'Type': 'CRAWLER'|'JOB'|'TRIGGER',
                                 'Name': 'string',
                                 'UniqueId': 'string',
                                 'TriggerDetails': {
                                     'Trigger': {
                                         'Name': 'string',
                                         'WorkflowName': 'string',
                                         'Id': 'string',
                                         'Type': 'SCHEDULED'|'CONDITIONAL'|'ON_DEMAND'|'EVENT',
                                         'State': 'CREATING'|'CREATED'|'ACTIVATING'|'ACTIVATED'|'DEACTIVATING'|'DEACTIVATED'|'DELETING'|'UPDATING',
                                         'Description': 'string',
                                         'Schedule': 'string',
                                         'Actions': [
                                             {
                                                 'JobName': 'string',
                                                 'Arguments': {
                                                     'string': 'string'
                                                 },
                                                 'Timeout': 123,
                                                 'SecurityConfiguration': 'string',
                                                 'NotificationProperty': {
                                                     'NotifyDelayAfter': 123
                                                 },
                                                 'CrawlerName': 'string'
                                             },
                                         ],
                                         'Predicate': {
                                             'Logical': 'AND'|'ANY',
                                             'Conditions': [
                                                 {
                                                     'LogicalOperator': 'EQUALS',
                                                     'JobName': 'string',
                                                     'State': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT'|'ERROR'|'WAITING'|'EXPIRED',
                                                     'CrawlerName': 'string',
                                                     'CrawlState': 'RUNNING'|'CANCELLING'|'CANCELLED'|'SUCCEEDED'|'FAILED'|'ERROR'
                                                 },
                                             ]
                                         },
                                         'EventBatchingCondition': {
                                             'BatchSize': 123,
                                             'BatchWindow': 123
                                         }
                                     }
                                 },
                                 'JobDetails': {
                                     'JobRuns': [
                                         {
                                             'Id': 'string',
                                             'Attempt': 123,
                                             'PreviousRunId': 'string',
                                             'TriggerName': 'string',
                                             'JobName': 'string',
                                             'JobMode': 'SCRIPT'|'VISUAL'|'NOTEBOOK',
                                             'JobRunQueuingEnabled': True|False,
                                             'StartedOn': datetime(2015, 1, 1),
                                             'LastModifiedOn': datetime(2015, 1, 1),
                                             'CompletedOn': datetime(2015, 1, 1),
                                             'JobRunState': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT'|'ERROR'|'WAITING'|'EXPIRED',
                                             'Arguments': {
                                                 'string': 'string'
                                             },
                                             'ErrorMessage': 'string',
                                             'PredecessorRuns': [
                                                 {
                                                     'JobName': 'string',
                                                     'RunId': 'string'
                                                 },
                                             ],
                                             'AllocatedCapacity': 123,
                                             'ExecutionTime': 123,
                                             'Timeout': 123,
                                             'MaxCapacity': 123.0,
                                             'WorkerType': 'Standard'|'G.1X'|'G.2X'|'G.025X'|'G.4X'|'G.8X'|'Z.2X',
                                             'NumberOfWorkers': 123,
                                             'SecurityConfiguration': 'string',
                                             'LogGroupName': 'string',
                                             'NotificationProperty': {
                                                 'NotifyDelayAfter': 123
                                             },
                                             'GlueVersion': 'string',
                                             'DPUSeconds': 123.0,
                                             'ExecutionClass': 'FLEX'|'STANDARD',
                                             'MaintenanceWindow': 'string',
                                             'ProfileName': 'string',
                                             'StateDetail': 'string',
                                             'ExecutionRoleSessionPolicy': 'string'
                                         },
                                     ]
                                 },
                                 'CrawlerDetails': {
                                     'Crawls': [
                                         {
                                             'State': 'RUNNING'|'CANCELLING'|'CANCELLED'|'SUCCEEDED'|'FAILED'|'ERROR',
                                             'StartedOn': datetime(2015, 1, 1),
                                             'CompletedOn': datetime(2015, 1, 1),
                                             'ErrorMessage': 'string',
                                             'LogGroup': 'string',
                                             'LogStream': 'string'
                                         },
                                     ]
                                 }
                             },
                         ],
                         'Edges': [
                             {
                                 'SourceId': 'string',
                                 'DestinationId': 'string'
                             },
                         ]
                     },
                     'MaxConcurrentRuns': 123,
                     'BlueprintDetails': {
                         'BlueprintName': 'string',
                         'RunId': 'string'
                     }
                 },
             ],
             'MissingWorkflows': [
                 'string',
             ]
         }
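
      As a rough illustration of how the response shape above might be
      consumed, the sketch below walks a dict with that shape and
      collects job runs that ended unsuccessfully. The helper name
      "summarize_workflows" and the "sample_response" data are
      hypothetical and exist only for this example; it assumes the
      graph of interest is the one under "LastRun".

```python
# Sketch only: walks a dict shaped like the response syntax above.
# `summarize_workflows` and `sample_response` are hypothetical names
# introduced for illustration; they are not part of boto3.

UNSUCCESSFUL_JOB_STATES = {"FAILED", "TIMEOUT", "ERROR"}


def summarize_workflows(response):
    """Return (summaries, missing_names) for a response-shaped dict."""
    summaries = []
    for workflow in response.get("Workflows", []):
        last_run = workflow.get("LastRun", {})
        nodes = last_run.get("Graph", {}).get("Nodes", [])
        unsuccessful = []
        for node in nodes:
            # JobDetails is documented as present when the node is a job,
            # so other node types fall through to the empty default.
            for run in node.get("JobDetails", {}).get("JobRuns", []):
                if run.get("JobRunState") in UNSUCCESSFUL_JOB_STATES:
                    unsuccessful.append((run.get("JobName"), run.get("Id")))
        statistics = last_run.get("Statistics", {})
        summaries.append(
            {
                "Name": workflow.get("Name"),
                "LastRunStatus": last_run.get("Status"),
                "FailedActions": statistics.get("FailedActions", 0),
                "UnsuccessfulJobRuns": unsuccessful,
            }
        )
    return summaries, list(response.get("MissingWorkflows", []))


# Hand-written stand-in for a real response; every value is made up.
sample_response = {
    "Workflows": [
        {
            "Name": "nightly-etl",
            "LastRun": {
                "Status": "COMPLETED",
                "Statistics": {"TotalActions": 2, "FailedActions": 1},
                "Graph": {
                    "Nodes": [
                        {"Type": "TRIGGER", "Name": "start"},
                        {
                            "Type": "JOB",
                            "Name": "load",
                            "JobDetails": {
                                "JobRuns": [
                                    {"Id": "jr_1", "JobName": "load",
                                     "JobRunState": "SUCCEEDED"},
                                    {"Id": "jr_2", "JobName": "load",
                                     "JobRunState": "FAILED"},
                                ]
                            },
                        },
                    ]
                },
            },
        }
    ],
    "MissingWorkflows": ["does-not-exist"],
}

summaries, missing = summarize_workflows(sample_response)
print(summaries)
print(missing)
```

      Against a live account, the input would presumably come from a
      call such as "client.batch_get_workflows(Names=[...],
      IncludeGraph=True)". Using ".get()" with empty defaults keeps the
      walk tolerant of workflows that have no last run or no graph.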

      **Response Structure**

      * *(dict) --*

        * **Workflows** *(list) --*

          A list of workflow resource metadata.

          * *(dict) --*

            A workflow is a collection of multiple dependent Glue jobs
            and crawlers that are run to complete a complex ETL task.
            A workflow manages the execution and monitoring of all its
            jobs and crawlers.

            * **Name** *(string) --*

              The name of the workflow.

            * **Description** *(string) --*

              A description of the workflow.

            * **DefaultRunProperties** *(dict) --*

              A collection of properties to be used as part of each
              execution of the workflow. The run properties are made
              available to each job in the workflow. A job can modify
              the properties for the next jobs in the flow.

              * *(string) --*

                * *(string) --*

            * **CreatedOn** *(datetime) --*

              The date and time when the workflow was created.

            * **LastModifiedOn** *(datetime) --*

              The date and time when the workflow was last modified.

            * **LastRun** *(dict) --*

              The information about the last execution of the
              workflow.

              * **Name** *(string) --*

                Name of the workflow that was run.

              * **WorkflowRunId** *(string) --*

                The ID of this workflow run.

              * **PreviousRunId** *(string) --*

                The ID of the previous workflow run.

              * **WorkflowRunProperties** *(dict) --*

                The workflow run properties which were set during the
                run.

                * *(string) --*

                  * *(string) --*

              * **StartedOn** *(datetime) --*

                The date and time when the workflow run was started.

              * **CompletedOn** *(datetime) --*

                The date and time when the workflow run completed.

              * **Status** *(string) --*

                The status of the workflow run.

              * **ErrorMessage** *(string) --*

                This error message describes any error that may have
                occurred in starting the workflow run. Currently the
                only error message is "Concurrent runs exceeded for
                workflow: "foo"."

              * **Statistics** *(dict) --*

                The statistics of the run.

                * **TotalActions** *(integer) --*

                  Total number of Actions in the workflow run.

                * **TimeoutActions** *(integer) --*

                  Total number of Actions that timed out.

                * **FailedActions** *(integer) --*

                  Total number of Actions that have failed.

                * **StoppedActions** *(integer) --*

                  Total number of Actions that have stopped.

                * **SucceededActions** *(integer) --*

                  Total number of Actions that have succeeded.

                * **RunningActions** *(integer) --*

                  Total number of Actions in the running state.

                * **ErroredActions** *(integer) --*

                  Indicates the count of job runs in the ERROR state
                  in the workflow run.

                * **WaitingActions** *(integer) --*

                  Indicates the count of job runs in the WAITING
                  state in the workflow run.

              * **Graph** *(dict) --*

                The graph representing all the Glue components that
                belong to the workflow as nodes and directed
                connections between them as edges.

                * **Nodes** *(list) --*

                  A list of the Glue components that belong to the
                  workflow, represented as nodes.

                  * *(dict) --*

                    A node represents a Glue component (trigger,
                    crawler, or job) on a workflow graph.

                    * **Type** *(string) --*

                      The type of Glue component represented by the
                      node.

                    * **Name** *(string) --*

                      The name of the Glue component represented by
                      the node.

                    * **UniqueId** *(string) --*

                      The unique ID assigned to the node within the
                      workflow.

                    * **TriggerDetails** *(dict) --*

                      Details of the Trigger when the node represents
                      a Trigger.

                      * **Trigger** *(dict) --*

                        The information of the trigger represented by
                        the trigger node.

                        * **Name** *(string) --*

                          The name of the trigger.

                        * **WorkflowName** *(string) --*

                          The name of the workflow associated with the
                          trigger.

                        * **Id** *(string) --*

                          Reserved for future use.

                        * **Type** *(string) --*

                          The type of trigger that this is.

                        * **State** *(string) --*

                          The current state of the trigger.

                        * **Description** *(string) --*

                          A description of this trigger.

                        * **Schedule** *(string) --*

                          A "cron" expression used to specify the
                          schedule (see Time-Based Schedules for Jobs
                          and Crawlers). For example, to run something
                          every day at 12:15 UTC, you would specify:
                          "cron(15 12 * * ? *)".

                        * **Actions** *(list) --*

                          The actions initiated by this trigger.

                          * *(dict) --*

                            Defines an action to be initiated by a
                            trigger.

                            * **JobName** *(string) --*

                              The name of a job to be run.

                            * **Arguments** *(dict) --*

                              The job arguments used when this trigger
                              fires. For this job run, they replace
                              the default arguments set in the job
                              definition itself.

                              You can specify arguments here that your
                              own job-execution script consumes, as
                              well as arguments that Glue itself
                              consumes.

                              For information about how to specify and
                              consume your own Job arguments, see the
                              Calling Glue APIs in Python topic in the
                              developer guide.

                              For information about the key-value
                              pairs that Glue consumes to set up your
                              job, see the Special Parameters Used by
                              Glue topic in the developer guide.

                              * *(string) --*

                                * *(string) --*

                            * **Timeout** *(integer) --*

                              The "JobRun" timeout in minutes. This is
                              the maximum time that a job run can
                              consume resources before it is
                              terminated and enters "TIMEOUT" status.
                              This overrides the timeout value set in
                              the parent job.

                              Jobs must have timeout values less than
                              7 days or 10080 minutes. Otherwise, the
                              jobs will throw an exception.

                              When the value is left blank, the
                              timeout is defaulted to 2880 minutes.

                              Any existing Glue jobs that had a
                              timeout value greater than 7 days will
                              be defaulted to 7 days. For instance, if
                              you have specified a timeout of 20 days
                              for a batch job, it will be stopped on
                              the 7th day.

                              For streaming jobs, if you have set up a
                              maintenance window, it will be restarted
                              during the maintenance window after 7
                              days.

                            * **SecurityConfiguration** *(string) --*

                              The name of the "SecurityConfiguration"
                              structure to be used with this action.

                            * **NotificationProperty** *(dict) --*

                              Specifies configuration properties of a
                              job run notification.

                              * **NotifyDelayAfter** *(integer) --*

                                After a job run starts, the number of
                                minutes to wait before sending a job
                                run delay notification.

                            * **CrawlerName** *(string) --*

                              The name of the crawler to be used with
                              this action.

                        * **Predicate** *(dict) --*

                          The predicate of this trigger, which defines
                          when it will fire.

                          * **Logical** *(string) --*

                            An optional field if only one condition is
                            listed. If multiple conditions are listed,
                            then this field is required.

                          * **Conditions** *(list) --*

                            A list of the conditions that determine
                            when the trigger will fire.

                            * *(dict) --*

                              Defines a condition under which a
                              trigger fires.

                              * **LogicalOperator** *(string) --*

                                A logical operator.

                              * **JobName** *(string) --*

                                The name of the job whose "JobRuns"
                                this condition applies to, and on
                                which this trigger waits.

                              * **State** *(string) --*

                                The condition state. Currently, the
                                only job states that a trigger can
                                listen for are "SUCCEEDED", "STOPPED",
                                "FAILED", and "TIMEOUT". The only
                                crawler states that a trigger can
                                listen for are "SUCCEEDED", "FAILED",
                                and "CANCELLED".

                              * **CrawlerName** *(string) --*

                                The name of the crawler to which this
                                condition applies.

                              * **CrawlState** *(string) --*

                                The state of the crawler to which this
                                condition applies.

                        * **EventBatchingCondition** *(dict) --*

                          Batch condition that must be met (specified
                          number of events received or batch time
                          window expired) before the EventBridge event
                          trigger fires.

                          * **BatchSize** *(integer) --*

                            Number of events that must be received
                            from Amazon EventBridge before the
                            EventBridge event trigger fires.

                          * **BatchWindow** *(integer) --*

                            Window of time in seconds after which the
                            EventBridge event trigger fires. The
                            window starts when the first event is
                            received.

                    * **JobDetails** *(dict) --*

                      Details of the Job when the node represents a
                      Job.

                      * **JobRuns** *(list) --*

                        The information for the job runs represented
                        by the job node.

                        * *(dict) --*

                          Contains information about a job run.

                          * **Id** *(string) --*

                            The ID of this job run.

                          * **Attempt** *(integer) --*

                            The number of the attempt to run this job.

                          * **PreviousRunId** *(string) --*

                            The ID of the previous run of this job.
                            For example, the "JobRunId" specified in
                            the "StartJobRun" action.

                          * **TriggerName** *(string) --*

                            The name of the trigger that started this
                            job run.

                          * **JobName** *(string) --*

                            The name of the job definition being used
                            in this run.

                          * **JobMode** *(string) --*

                            A mode that describes how a job was
                            created. Valid values are:

                            * "SCRIPT" - The job was created using the
                              Glue Studio script editor.

                            * "VISUAL" - The job was created using the
                              Glue Studio visual editor.

                            * "NOTEBOOK" - The job was created using
                              an interactive sessions notebook.

                            When the "JobMode" field is missing or
                            null, "SCRIPT" is assigned as the default
                            value.

                          * **JobRunQueuingEnabled** *(boolean) --*

                            Specifies whether job run queuing is
                            enabled for the job run.

                            A value of true means job run queuing is
                            enabled for the job run. If false or not
                            populated, the job run will not be
                            considered for queuing.

                          * **StartedOn** *(datetime) --*

                            The date and time at which this job run
                            was started.

                          * **LastModifiedOn** *(datetime) --*

                            The last time that this job run was
                            modified.

                          * **CompletedOn** *(datetime) --*

                            The date and time that this job run
                            completed.

                          * **JobRunState** *(string) --*

                            The current state of the job run. For more
                            information about the statuses of jobs
                            that have terminated abnormally, see Glue
                            Job Run Statuses.

                          * **Arguments** *(dict) --*

                            The job arguments associated with this
                            run. For this job run, they replace the
                            default arguments set in the job
                            definition itself.

                            You can specify arguments here that your
                            own job-execution script consumes, as well
                            as arguments that Glue itself consumes.

                            Job arguments may be logged. Do not pass
                            plaintext secrets as arguments. Retrieve
                            secrets from a Glue Connection, Secrets
                            Manager or other secret management
                            mechanism if you intend to keep them
                            within the Job.

                            For information about how to specify and
                            consume your own Job arguments, see the
                            Calling Glue APIs in Python topic in the
                            developer guide.

                            For information about the arguments you
                            can provide to this field when configuring
                            Spark jobs, see the Special Parameters
                            Used by Glue topic in the developer guide.

                            For information about the arguments you
                            can provide to this field when configuring
                            Ray jobs, see Using job parameters in Ray
                            jobs in the developer guide.

                            * *(string) --*

                              * *(string) --*

                          * **ErrorMessage** *(string) --*

                            An error message associated with this job
                            run.

                          * **PredecessorRuns** *(list) --*

                            A list of predecessors to this job run.

                            * *(dict) --*

                              A job run that was used in the predicate
                              of a conditional trigger that triggered
                              this job run.

                              * **JobName** *(string) --*

                                The name of the job definition used by
                                the predecessor job run.

                              * **RunId** *(string) --*

                                The job-run ID of the predecessor job
                                run.

                          * **AllocatedCapacity** *(integer) --*

                            This field is deprecated. Use
                            "MaxCapacity" instead.

                            The number of Glue data processing units
                            (DPUs) allocated to this JobRun. From 2 to
                            100 DPUs can be allocated; the default is
                            10. A DPU is a relative measure of
                            processing power that consists of 4 vCPUs
                            of compute capacity and 16 GB of memory.
                            For more information, see the Glue pricing
                            page.

                          * **ExecutionTime** *(integer) --*

                            The amount of time (in seconds) that the
                            job run consumed resources.

                          * **Timeout** *(integer) --*

                            The "JobRun" timeout in minutes. This is
                            the maximum time that a job run can
                            consume resources before it is terminated
                            and enters "TIMEOUT" status. This value
                            overrides the timeout value set in the
                            parent job.

                            Jobs must have timeout values less than 7
                            days or 10080 minutes. Otherwise, the jobs
                            will throw an exception.

                            When the value is left blank, the timeout
                            is defaulted to 2880 minutes.

                            Any existing Glue jobs that had a timeout
                            value greater than 7 days will be
                            defaulted to 7 days. For instance, if you
                            have specified a timeout of 20 days for a
                            batch job, it will be stopped on the 7th
                            day.

                            For streaming jobs, if you have set up a
                            maintenance window, it will be restarted
                            during the maintenance window after 7
                            days.

                          * **MaxCapacity** *(float) --*

                            For Glue version 1.0 or earlier jobs,
                            using the standard worker type, the number
                            of Glue data processing units (DPUs) that
                            can be allocated when this job runs. A DPU
                            is a relative measure of processing power
                            that consists of 4 vCPUs of compute
                            capacity and 16 GB of memory. For more
                            information, see the Glue pricing page.

                            For Glue version 2.0+ jobs, you cannot
                            specify a "Maximum capacity". Instead, you
                            should specify a "Worker type" and the
                            "Number of workers".

                            Do not set "MaxCapacity" if using
                            "WorkerType" and "NumberOfWorkers".

                            The value that can be allocated for
                            "MaxCapacity" depends on whether you are
                            running a Python shell job, an Apache
                            Spark ETL job, or an Apache Spark
                            streaming ETL job:

                            * When you specify a Python shell job
                              ("JobCommand.Name"="pythonshell"), you
                              can allocate either 0.0625 or 1 DPU. The
                              default is 0.0625 DPU.

                            * When you specify an Apache Spark ETL job
                              ("JobCommand.Name"="glueetl") or Apache
                              Spark streaming ETL job
                              ("JobCommand.Name"="gluestreaming"), you
                              can allocate from 2 to 100 DPUs. The
                              default is 10 DPUs. This job type cannot
                              have a fractional DPU allocation.

                          * **WorkerType** *(string) --*

                            The type of predefined worker that is
                            allocated when a job runs. Accepts a value
                            of G.1X, G.2X, G.4X, G.8X or G.025X for
                            Spark jobs. Accepts the value Z.2X for Ray
                            jobs.

                            * For the "G.1X" worker type, each worker
                              maps to 1 DPU (4 vCPUs, 16 GB of memory)
                              with 94GB disk, and provides 1 executor
                              per worker. We recommend this worker
                              type for workloads such as data
                              transforms, joins, and queries, as it
                              offers a scalable and cost-effective way
                              to run most jobs.

                            * For the "G.2X" worker type, each worker
                              maps to 2 DPU (8 vCPUs, 32 GB of memory)
                              with 138GB disk, and provides 1 executor
                              per worker. We recommend this worker
                              type for workloads such as data
                              transforms, joins, and queries, as it
                              offers a scalable and cost-effective way
                              to run most jobs.

                            * For the "G.4X" worker type, each worker
                              maps to 4 DPU (16 vCPUs, 64 GB of
                              memory) with 256GB disk, and provides 1
                              executor per worker. We recommend this
                              worker type for jobs whose workloads
                              contain your most demanding transforms,
                              aggregations, joins, and queries. This
                              worker type is available only for Glue
                              version 3.0 or later Spark ETL jobs in
                              the following Amazon Web Services
                              Regions: US East (Ohio), US East (N.
                              Virginia), US West (Oregon), Asia
                              Pacific (Singapore), Asia Pacific
                              (Sydney), Asia Pacific (Tokyo), Canada
                              (Central), Europe (Frankfurt), Europe
                              (Ireland), and Europe (Stockholm).

                            * For the "G.8X" worker type, each worker
                              maps to 8 DPU (32 vCPUs, 128 GB of
                              memory) with 512GB disk, and provides 1
                              executor per worker. We recommend this
                              worker type for jobs whose workloads
                              contain your most demanding transforms,
                              aggregations, joins, and queries. This
                              worker type is available only for Glue
                              version 3.0 or later Spark ETL jobs, in
                              the same Amazon Web Services Regions as
                              supported for the "G.4X" worker type.

                            * For the "G.025X" worker type, each
                              worker maps to 0.25 DPU (2 vCPUs, 4 GB
                              of memory) with 84GB disk, and provides
                              1 executor per worker. We recommend this
                              worker type for low volume streaming
                              jobs. This worker type is only available
                              for Glue version 3.0 or later streaming
                              jobs.

                            * For the "Z.2X" worker type, each worker
                              maps to 2 M-DPU (8vCPUs, 64 GB of
                              memory) with 128 GB disk, and provides
                              up to 8 Ray workers based on the
                              autoscaler.

                          * **NumberOfWorkers** *(integer) --*

                            The number of workers of a defined
                            "workerType" that are allocated when a job
                            runs.

                          * **SecurityConfiguration** *(string) --*

                            The name of the "SecurityConfiguration"
                            structure to be used with this job run.

                          * **LogGroupName** *(string) --*

                            The name of the log group for secure
                            logging that can be server-side encrypted
                            in Amazon CloudWatch using KMS. This name
                            can be "/aws-glue/jobs/", in which case
                            the default encryption is "NONE". If you
                            add a role name and
                            "SecurityConfiguration" name (in other
                            words, "/aws-glue/jobs-yourRoleName-
                            yourSecurityConfigurationName/"), then
                            that security configuration is used to
                            encrypt the log group.

                          * **NotificationProperty** *(dict) --*

                            Specifies configuration properties of a
                            job run notification.

                            * **NotifyDelayAfter** *(integer) --*

                              After a job run starts, the number of
                              minutes to wait before sending a job run
                              delay notification.

                          * **GlueVersion** *(string) --*

                            In Spark jobs, "GlueVersion" determines
                            the versions of Apache Spark and Python
                            that Glue makes available in a job. The
                            Python version indicates the version
                            supported for jobs of type Spark.

                            Ray jobs should set "GlueVersion" to "4.0"
                            or greater. However, the versions of Ray,
                            Python and additional libraries available
                            in your Ray job are determined by the
                            "Runtime" parameter of the Job command.

                            For more information about the available
                            Glue versions and corresponding Spark and
                            Python versions, see Glue version in the
                            developer guide.

                            Jobs that are created without specifying a
                            Glue version default to Glue 0.9.

                          * **DPUSeconds** *(float) --*

                            This field can be set either for job runs
                            with execution class "FLEX" or when Auto
                            Scaling is enabled. It represents the
                            total time each executor ran during the
                            lifecycle of a job run, in seconds,
                            multiplied by a DPU factor (1 for "G.1X",
                            2 for "G.2X", or 0.25 for "G.025X"
                            workers). For Auto Scaling jobs, this
                            value may differ from
                            "executionEngineRuntime" * "MaxCapacity",
                            because the number of executors running
                            at a given time may be less than
                            "MaxCapacity". Therefore, the value of
                            "DPUSeconds" can be less than
                            "executionEngineRuntime" * "MaxCapacity".

                          * **ExecutionClass** *(string) --*

                            Indicates whether the job is run with a
                            standard or flexible execution class. The
                            standard execution class is ideal for
                            time-sensitive workloads that require fast
                            job startup and dedicated resources.

                            The flexible execution class is
                            appropriate for time-insensitive jobs
                            whose start and completion times may vary.

                            Only jobs with Glue version 3.0 and above
                            and command type "glueetl" will be allowed
                            to set "ExecutionClass" to "FLEX". The
                            flexible execution class is available for
                            Spark jobs.

                          * **MaintenanceWindow** *(string) --*

                            This field specifies a day of the week and
                            hour for a maintenance window for
                            streaming jobs. Glue periodically performs
                            maintenance activities. During these
                            maintenance windows, Glue will need to
                            restart your streaming jobs.

                            Glue will restart the job within 3 hours
                            of the specified maintenance window. For
                            instance, if you set up the maintenance
                            window for Monday at 10:00 AM GMT, your
                            jobs will be restarted between 10:00 AM
                            GMT and 1:00 PM GMT.

                          * **ProfileName** *(string) --*

                            The name of a Glue usage profile
                            associated with the job run.

                          * **StateDetail** *(string) --*

                            This field holds details that pertain to
                            the state of a job run. The field is
                            nullable.

                            For example, when a job run is in a
                            WAITING state as a result of job run
                            queuing, the field has the reason why the
                            job run is in that state.

                          * **ExecutionRoleSessionPolicy** *(string)
                            --*

                            An inline session policy passed to the
                            StartJobRun API that lets you dynamically
                            restrict the permissions of the specified
                            execution role for the scope of the job,
                            without requiring the creation of
                            additional IAM roles.
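
The relationship between "DPUSeconds" and "executionEngineRuntime" * "MaxCapacity" described for job runs above can be illustrated with a small arithmetic sketch. The executor runtimes and the capacity used here are hypothetical values chosen for illustration; they are not output from Glue, and the helper "dpu_seconds" is not part of boto3.

```python
# DPU factors quoted in the DPUSeconds description above.
DPU_FACTOR = {"G.1X": 1.0, "G.2X": 2.0, "G.025X": 0.25}


def dpu_seconds(worker_type, executor_runtimes_s):
    """Sum each executor's runtime in seconds, scaled by the DPU factor."""
    return DPU_FACTOR[worker_type] * sum(executor_runtimes_s)


# Hypothetical Auto Scaling run: 600 s of engine runtime, MaxCapacity of
# 4 DPUs, but two of the four G.1X executors ran for only part of the run.
execution_engine_runtime_s = 600
max_capacity = 4

used = dpu_seconds("G.1X", [600, 600, 300, 120])
upper_bound = execution_engine_runtime_s * max_capacity

# With fewer executors active at times, the billed figure stays below
# the runtime * capacity product.
print(used, upper_bound, used < upper_bound)
```

In this sketch the product of runtime and capacity acts only as an upper bound; the reported value depends on how long each executor actually ran.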

                    * **CrawlerDetails** *(dict) --*

                      Details of the crawler when the node represents
                      a crawler.

                      * **Crawls** *(list) --*

                        A list of crawls represented by the crawl
                        node.

                        * *(dict) --*

                          The details of a crawl in the workflow.

                          * **State** *(string) --*

                            The state of the crawler.

                          * **StartedOn** *(datetime) --*

                            The date and time on which the crawl
                            started.

                          * **CompletedOn** *(datetime) --*

                            The date and time on which the crawl
                            completed.

                          * **ErrorMessage** *(string) --*

                            The error message associated with the
                            crawl.

                          * **LogGroup** *(string) --*

                            The log group associated with the crawl.

                          * **LogStream** *(string) --*

                            The log stream associated with the crawl.

                * **Edges** *(list) --*

                  A list of all the directed connections between the
                  nodes belonging to the workflow.

                  * *(dict) --*

                    An edge represents a directed connection between
                    two Glue components that are part of the workflow
                    the edge belongs to.

                    * **SourceId** *(string) --*

                      The unique ID of the node within the workflow
                      where the edge starts.

                    * **DestinationId** *(string) --*

                      The unique ID of the node within the workflow
                      where the edge ends.
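
Because each edge refers to nodes only by their unique IDs, reading the graph usually means joining "Edges" against "Nodes". The following is a minimal sketch of that join, assuming a dict shaped like the graph structure described here; the sample node names, IDs, and type strings are hypothetical, and "build_adjacency" is an illustrative helper, not a boto3 function.

```python
from collections import defaultdict


def build_adjacency(graph):
    """Map each node's Name to the Names of the nodes its edges point to."""
    names = {node["UniqueId"]: node["Name"] for node in graph.get("Nodes", [])}
    adjacency = defaultdict(list)
    for edge in graph.get("Edges", []):
        source = names[edge["SourceId"]]
        destination = names[edge["DestinationId"]]
        adjacency[source].append(destination)
    return dict(adjacency)


# Hypothetical graph; only the keys used by the helper are filled in.
sample_graph = {
    "Nodes": [
        {"Type": "TRIGGER", "Name": "start-trigger", "UniqueId": "n1"},
        {"Type": "JOB", "Name": "etl-job", "UniqueId": "n2"},
        {"Type": "CRAWLER", "Name": "catalog-crawler", "UniqueId": "n3"},
    ],
    "Edges": [
        {"SourceId": "n1", "DestinationId": "n2"},
        {"SourceId": "n2", "DestinationId": "n3"},
    ],
}

adjacency = build_adjacency(sample_graph)
print(adjacency)
```

Keying the result by name rather than by unique ID is a readability choice for this sketch; keep the IDs if node names may repeat within a workflow.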

              * **StartingEventBatchCondition** *(dict) --*

                The batch condition that started the workflow run.

                * **BatchSize** *(integer) --*

                  Number of events in the batch.

                * **BatchWindow** *(integer) --*

                  Duration of the batch window in seconds.

            * **Graph** *(dict) --*

              The graph representing all the Glue components that
              belong to the workflow as nodes and directed connections
              between them as edges.

              * **Nodes** *(list) --*

                A list of the Glue components that belong to the
                workflow, represented as nodes.

                * *(dict) --*

                  A node represents a Glue component (trigger,
                  crawler, or job) on a workflow graph.

                  * **Type** *(string) --*

                    The type of Glue component represented by the
                    node.

                  * **Name** *(string) --*

                    The name of the Glue component represented by the
                    node.

                  * **UniqueId** *(string) --*

                    The unique Id assigned to the node within the
                    workflow.

                  * **TriggerDetails** *(dict) --*

                    Details of the Trigger when the node represents a
                    Trigger.

                    * **Trigger** *(dict) --*

                      The information of the trigger represented by
                      the trigger node.

                      * **Name** *(string) --*

                        The name of the trigger.

                      * **WorkflowName** *(string) --*

                        The name of the workflow associated with the
                        trigger.

                      * **Id** *(string) --*

                        Reserved for future use.

                      * **Type** *(string) --*

                        The type of trigger that this is.

                      * **State** *(string) --*

                        The current state of the trigger.

                      * **Description** *(string) --*

                        A description of this trigger.

                      * **Schedule** *(string) --*

                        A "cron" expression used to specify the
                        schedule (see Time-Based Schedules for Jobs
                        and Crawlers). For example, to run something
                        every day at 12:15 UTC, you would specify:
                        "cron(15 12 * * ? *)".

                      * **Actions** *(list) --*

                        The actions initiated by this trigger.

                        * *(dict) --*

                          Defines an action to be initiated by a
                          trigger.

                          * **JobName** *(string) --*

                            The name of a job to be run.

                          * **Arguments** *(dict) --*

                            The job arguments used when this trigger
                            fires. For this job run, they replace the
                            default arguments set in the job
                            definition itself.

                            You can specify arguments here that your
                            own job-execution script consumes, as well
                            as arguments that Glue itself consumes.

                            For information about how to specify and
                            consume your own Job arguments, see the
                            Calling Glue APIs in Python topic in the
                            developer guide.

                            For information about the key-value pairs
                            that Glue consumes to set up your job, see
                            the Special Parameters Used by Glue topic
                            in the developer guide.

                            * *(string) --*

                              * *(string) --*

                          * **Timeout** *(integer) --*

                            The "JobRun" timeout in minutes. This is
                            the maximum time that a job run can
                            consume resources before it is terminated
                            and enters "TIMEOUT" status. This
                            overrides the timeout value set in the
                            parent job.

                            Jobs must have timeout values less than 7
                            days or 10080 minutes. Otherwise, the jobs
                            will throw an exception.

                            When the value is left blank, the timeout
                            is defaulted to 2880 minutes.

                            Any existing Glue jobs that had a timeout
                            value greater than 7 days will be
                            defaulted to 7 days. For instance, if you
                            have specified a timeout of 20 days for a
                            batch job, it will be stopped on the 7th
                            day.

                            For streaming jobs, if you have set up a
                            maintenance window, the job will be
                            restarted during the maintenance window
                            after 7 days.

                          * **SecurityConfiguration** *(string) --*

                            The name of the "SecurityConfiguration"
                            structure to be used with this action.

                          * **NotificationProperty** *(dict) --*

                            Specifies configuration properties of a
                            job run notification.

                            * **NotifyDelayAfter** *(integer) --*

                              After a job run starts, the number of
                              minutes to wait before sending a job run
                              delay notification.

                          * **CrawlerName** *(string) --*

                            The name of the crawler to be used with
                            this action.

                      * **Predicate** *(dict) --*

                        The predicate of this trigger, which defines
                        when it will fire.

                        * **Logical** *(string) --*

                          An optional field if only one condition is
                          listed. If multiple conditions are listed,
                          then this field is required.

                        * **Conditions** *(list) --*

                          A list of the conditions that determine when
                          the trigger will fire.

                          * *(dict) --*

                            Defines a condition under which a trigger
                            fires.

                            * **LogicalOperator** *(string) --*

                              A logical operator.

                            * **JobName** *(string) --*

                              The name of the job whose "JobRuns" this
                              condition applies to, and on which this
                              trigger waits.

                            * **State** *(string) --*

                              The condition state. Currently, the only
                              job states that a trigger can listen for
                              are "SUCCEEDED", "STOPPED", "FAILED",
                              and "TIMEOUT". The only crawler states
                              that a trigger can listen for are
                              "SUCCEEDED", "FAILED", and "CANCELLED".

                            * **CrawlerName** *(string) --*

                              The name of the crawler to which this
                              condition applies.

                            * **CrawlState** *(string) --*

                              The state of the crawler to which this
                              condition applies.

                      * **EventBatchingCondition** *(dict) --*

                        Batch condition that must be met (specified
                        number of events received or batch time window
                        expired) before EventBridge event trigger
                        fires.

                        * **BatchSize** *(integer) --*

                          Number of events that must be received from
                          Amazon EventBridge before EventBridge event
                          trigger fires.

                        * **BatchWindow** *(integer) --*

                          Window of time in seconds after which
                          EventBridge event trigger fires. Window
                          starts when first event is received.
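
The event batching condition above combines two limits: the trigger fires once either the event count or the time window is reached. The sketch below models that rule locally, assuming inclusive comparisons on both limits; it is not Glue's implementation, and the function name and sample values are hypothetical.

```python
def batch_condition_met(events_received, seconds_since_first_event,
                        batch_size, batch_window):
    """Return True once either limit of the batching condition is reached."""
    return (events_received >= batch_size
            or seconds_since_first_event >= batch_window)


# Hypothetical condition, shaped like the "EventBatchingCondition" dict.
condition = {"BatchSize": 5, "BatchWindow": 900}

# Too few events and the window is still open: keep waiting.
waiting = batch_condition_met(3, 120, condition["BatchSize"],
                              condition["BatchWindow"])
# The event count has reached BatchSize.
by_size = batch_condition_met(5, 120, condition["BatchSize"],
                              condition["BatchWindow"])
# The window has elapsed since the first event arrived.
by_window = batch_condition_met(3, 900, condition["BatchSize"],
                                condition["BatchWindow"])
print(waiting, by_size, by_window)
```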

                  * **JobDetails** *(dict) --*

                    Details of the Job when the node represents a Job.

                    * **JobRuns** *(list) --*

                      The information for the job runs represented by
                      the job node.

                      * *(dict) --*

                        Contains information about a job run.

                        * **Id** *(string) --*

                          The ID of this job run.

                        * **Attempt** *(integer) --*

                          The number of the attempt to run this job.

                        * **PreviousRunId** *(string) --*

                          The ID of the previous run of this job. For
                          example, the "JobRunId" specified in the
                          "StartJobRun" action.

                        * **TriggerName** *(string) --*

                          The name of the trigger that started this
                          job run.

                        * **JobName** *(string) --*

                          The name of the job definition being used in
                          this run.

                        * **JobMode** *(string) --*

                          A mode that describes how a job was created.
                          Valid values are:

                          * "SCRIPT" - The job was created using the
                            Glue Studio script editor.

                          * "VISUAL" - The job was created using the
                            Glue Studio visual editor.

                          * "NOTEBOOK" - The job was created using an
                            interactive sessions notebook.

                          When the "JobMode" field is missing or null,
                          "SCRIPT" is assigned as the default value.

                        * **JobRunQueuingEnabled** *(boolean) --*

                          Specifies whether job run queuing is enabled
                          for the job run.

                          A value of true means job run queuing is
                          enabled for the job run. If false or not
                          populated, the job run will not be
                          considered for queuing.

                        * **StartedOn** *(datetime) --*

                          The date and time at which this job run was
                          started.

                        * **LastModifiedOn** *(datetime) --*

                          The last time that this job run was
                          modified.

                        * **CompletedOn** *(datetime) --*

                          The date and time that this job run
                          completed.

                        * **JobRunState** *(string) --*

                          The current state of the job run. For more
                          information about the statuses of jobs that
                          have terminated abnormally, see Glue Job Run
                          Statuses.

                        * **Arguments** *(dict) --*

                          The job arguments associated with this run.
                          For this job run, they replace the default
                          arguments set in the job definition itself.

                          You can specify arguments here that your own
                          job-execution script consumes, as well as
                          arguments that Glue itself consumes.

                          Job arguments may be logged. Do not pass
                          plaintext secrets as arguments. Retrieve
                          secrets from a Glue Connection, Secrets
                          Manager or other secret management mechanism
                          if you intend to keep them within the Job.

                          For information about how to specify and
                          consume your own Job arguments, see the
                          Calling Glue APIs in Python topic in the
                          developer guide.

                          For information about the arguments you can
                          provide to this field when configuring Spark
                          jobs, see the Special Parameters Used by
                          Glue topic in the developer guide.

                          For information about the arguments you can
                          provide to this field when configuring Ray
                          jobs, see Using job parameters in Ray jobs
                          in the developer guide.

                          * *(string) --*

                            * *(string) --*

                        * **ErrorMessage** *(string) --*

                          An error message associated with this job
                          run.

                        * **PredecessorRuns** *(list) --*

                          A list of predecessors to this job run.

                          * *(dict) --*

                            A job run that was used in the predicate
                            of a conditional trigger that triggered
                            this job run.

                            * **JobName** *(string) --*

                              The name of the job definition used by
                              the predecessor job run.

                            * **RunId** *(string) --*

                              The job-run ID of the predecessor job
                              run.

                        * **AllocatedCapacity** *(integer) --*

                          This field is deprecated. Use "MaxCapacity"
                          instead.

                          The number of Glue data processing units
                          (DPUs) allocated to this JobRun. From 2 to
                          100 DPUs can be allocated; the default is
                          10. A DPU is a relative measure of
                          processing power that consists of 4 vCPUs of
                          compute capacity and 16 GB of memory. For
                          more information, see the Glue pricing page.

                        * **ExecutionTime** *(integer) --*

                          The amount of time (in seconds) that the job
                          run consumed resources.

                        * **Timeout** *(integer) --*

                          The "JobRun" timeout in minutes. This is the
                          maximum time that a job run can consume
                          resources before it is terminated and enters
                          "TIMEOUT" status. This value overrides the
                          timeout value set in the parent job.

                          Jobs must have timeout values less than 7
                          days or 10080 minutes. Otherwise, the jobs
                          will throw an exception.

                          When the value is left blank, the timeout is
                          defaulted to 2880 minutes.

                          Any existing Glue jobs that had a timeout
                          value greater than 7 days will be defaulted
                          to 7 days. For instance, if you have
                          specified a timeout of 20 days for a batch
                          job, it will be stopped on the 7th day.

                          For streaming jobs, if you have set up a
                          maintenance window, the job will be
                          restarted during the maintenance window
                          after 7 days.

                        * **MaxCapacity** *(float) --*

                          For Glue version 1.0 or earlier jobs, using
                          the standard worker type, the number of Glue
                          data processing units (DPUs) that can be
                          allocated when this job runs. A DPU is a
                          relative measure of processing power that
                          consists of 4 vCPUs of compute capacity and
                          16 GB of memory. For more information, see
                          the Glue pricing page.

                          For Glue version 2.0+ jobs, you cannot
                          specify a "Maximum capacity". Instead, you
                          should specify a "Worker type" and the
                          "Number of workers".

                          Do not set "MaxCapacity" if using
                          "WorkerType" and "NumberOfWorkers".

                          The value that can be allocated for
                          "MaxCapacity" depends on whether you are
                          running a Python shell job, an Apache Spark
                          ETL job, or an Apache Spark streaming ETL
                          job:

                          * When you specify a Python shell job
                            ("JobCommand.Name"="pythonshell"), you can
                            allocate either 0.0625 or 1 DPU. The
                            default is 0.0625 DPU.

                          * When you specify an Apache Spark ETL job
                            ("JobCommand.Name"="glueetl") or Apache
                            Spark streaming ETL job
                            ("JobCommand.Name"="gluestreaming"), you
                            can allocate from 2 to 100 DPUs. The
                            default is 10 DPUs. This job type cannot
                            have a fractional DPU allocation.

                        * **WorkerType** *(string) --*

                          The type of predefined worker that is
                          allocated when a job runs. Accepts a value
                          of G.1X, G.2X, G.4X, G.8X or G.025X for
                          Spark jobs. Accepts the value Z.2X for Ray
                          jobs.

                          * For the "G.1X" worker type, each worker
                            maps to 1 DPU (4 vCPUs, 16 GB of memory)
                            with 94 GB disk, and provides 1 executor
                            per worker. We recommend this worker type
                            for workloads such as data transforms,
                            joins, and queries, because it offers a
                            scalable and cost-effective way to run
                            most jobs.

                          * For the "G.2X" worker type, each worker
                            maps to 2 DPU (8 vCPUs, 32 GB of memory)
                            with 138 GB disk, and provides 1 executor
                            per worker. We recommend this worker type
                            for workloads such as data transforms,
                            joins, and queries, because it offers a
                            scalable and cost-effective way to run
                            most jobs.

                          * For the "G.4X" worker type, each worker
                            maps to 4 DPU (16 vCPUs, 64 GB of memory)
                            with 256 GB disk, and provides 1 executor
                            per worker. We recommend this worker type
                            for jobs whose workloads contain your most
                            demanding transforms, aggregations, joins,
                            and queries. This worker type is available
                            only for Glue version 3.0 or later Spark
                            ETL jobs in the following Amazon Web
                            Services Regions: US East (Ohio), US East
                            (N. Virginia), US West (Oregon), Asia
                            Pacific (Singapore), Asia Pacific
                            (Sydney), Asia Pacific (Tokyo), Canada
                            (Central), Europe (Frankfurt), Europe
                            (Ireland), and Europe (Stockholm).

                          * For the "G.8X" worker type, each worker
                            maps to 8 DPU (32 vCPUs, 128 GB of memory)
                            with 512 GB disk, and provides 1 executor
                            per worker. We recommend this worker type
                            for jobs whose workloads contain your most
                            demanding transforms, aggregations, joins,
                            and queries. This worker type is available
                            only for Glue version 3.0 or later Spark
                            ETL jobs, in the same Amazon Web Services
                            Regions as supported for the "G.4X" worker
                            type.

                          * For the "G.025X" worker type, each worker
                            maps to 0.25 DPU (2 vCPUs, 4 GB of memory)
                            with 84 GB disk, and provides 1 executor
                            per worker. We recommend this worker type
                            for low volume streaming jobs. This worker
                            type is only available for Glue version
                            3.0 or later streaming jobs.

                          * For the "Z.2X" worker type, each worker
                            maps to 2 M-DPU (8vCPUs, 64 GB of memory)
                            with 128 GB disk, and provides up to 8 Ray
                            workers based on the autoscaler.
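
The per-worker DPU figures in the list above can be combined with "NumberOfWorkers" to estimate the total capacity of a run. The following is a minimal sketch of that arithmetic; the lookup table restates the figures from the list, while the helper name "total_dpus" and the sample worker counts are hypothetical.

```python
# DPUs per worker, taken from the worker type list above. "Z.2X" is left
# out because it is measured in M-DPU rather than DPU.
DPU_PER_WORKER = {"G.025X": 0.25, "G.1X": 1, "G.2X": 2, "G.4X": 4, "G.8X": 8}


def total_dpus(worker_type, number_of_workers):
    """Multiply the per-worker DPU figure by the number of workers."""
    if worker_type not in DPU_PER_WORKER:
        raise ValueError(f"no DPU mapping for worker type {worker_type!r}")
    return DPU_PER_WORKER[worker_type] * number_of_workers


# Hypothetical job sizes.
print(total_dpus("G.2X", 10))   # ten workers at 2 DPU each
print(total_dpus("G.025X", 4))  # four workers at 0.25 DPU each
```

Raising an error for unknown worker types, rather than returning zero, keeps a mistyped "WorkerType" from silently producing a wrong estimate.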

                        * **NumberOfWorkers** *(integer) --*

                          The number of workers of a defined
                          "workerType" that are allocated when a job
                          runs.

                        * **SecurityConfiguration** *(string) --*

                          The name of the "SecurityConfiguration"
                          structure to be used with this job run.

                        * **LogGroupName** *(string) --*

                          The name of the log group for secure logging
                          that can be server-side encrypted in Amazon
                          CloudWatch using KMS. This name can be
                          "/aws-glue/jobs/", in which case the default
                          encryption is "NONE". If you add a role name
                          and "SecurityConfiguration" name (in other
                          words, "/aws-glue/jobs-yourRoleName-
                          yourSecurityConfigurationName/"), then that
                          security configuration is used to encrypt
                          the log group.

                        * **NotificationProperty** *(dict) --*

                          Specifies configuration properties of a job
                          run notification.

                          * **NotifyDelayAfter** *(integer) --*

                            After a job run starts, the number of
                            minutes to wait before sending a job run
                            delay notification.

                        * **GlueVersion** *(string) --*

                          In Spark jobs, "GlueVersion" determines the
                          versions of Apache Spark and Python that
                          Glue makes available in a job. The Python
                          version indicates the version supported for
                          jobs of type Spark.

                          Ray jobs should set "GlueVersion" to "4.0"
                          or greater. However, the versions of Ray,
                          Python and additional libraries available in
                          your Ray job are determined by the "Runtime"
                          parameter of the Job command.

                          For more information about the available
                          Glue versions and corresponding Spark and
                          Python versions, see Glue version in the
                          developer guide.

                          Jobs that are created without specifying a
                          Glue version default to Glue 0.9.

                        * **DPUSeconds** *(float) --*

                          This field can be set for job runs with
                          execution class "FLEX" or when Auto Scaling
                          is enabled. It represents the total time, in
                          seconds, that each executor ran during the
                          lifecycle of a job run, multiplied by a DPU
                          factor (1 for "G.1X", 2 for "G.2X", or 0.25
                          for "G.025X" workers). For Auto Scaling
                          jobs, this value may differ from
                          "executionEngineRuntime" * "MaxCapacity",
                          because the number of executors running at a
                          given time may be less than "MaxCapacity".
                          It is therefore possible for "DPUSeconds" to
                          be less than "executionEngineRuntime" *
                          "MaxCapacity".

                        * **ExecutionClass** *(string) --*

                          Indicates whether the job is run with a
                          standard or flexible execution class. The
                          standard execution class is ideal for
                          time-sensitive workloads that require fast
                          job startup and dedicated resources.

                          The flexible execution class is appropriate
                          for time-insensitive jobs whose start and
                          completion times may vary.

                          Only jobs with Glue version 3.0 and above
                          and command type "glueetl" will be allowed
                          to set "ExecutionClass" to "FLEX". The
                          flexible execution class is available for
                          Spark jobs.

                        * **MaintenanceWindow** *(string) --*

                          This field specifies a day of the week and
                          hour for a maintenance window for streaming
                          jobs. Glue periodically performs maintenance
                          activities. During these maintenance
                          windows, Glue will need to restart your
                          streaming jobs.

                          Glue will restart the job within 3 hours of
                          the specified maintenance window. For
                          instance, if you set up the maintenance
                          window for Monday at 10:00AM GMT, your jobs
                          will be restarted between 10:00AM GMT and
                          1:00PM GMT.

                        * **ProfileName** *(string) --*

                          The name of a Glue usage profile associated
                          with the job run.

                        * **StateDetail** *(string) --*

                          This field holds details that pertain to the
                          state of a job run. The field is nullable.

                          For example, when a job run is in a WAITING
                          state as a result of job run queuing, the
                          field has the reason why the job run is in
                          that state.

                        * **ExecutionRoleSessionPolicy** *(string) --*

                          An inline session policy passed to the
                          StartJobRun API. It allows you to
                          dynamically restrict the permissions of the
                          specified execution role for the scope of
                          the job, without requiring the creation of
                          additional IAM roles.

                  * **CrawlerDetails** *(dict) --*

                    Details of the crawler when the node represents a
                    crawler.

                    * **Crawls** *(list) --*

                      A list of crawls represented by the crawl node.

                      * *(dict) --*

                        The details of a crawl in the workflow.

                        * **State** *(string) --*

                          The state of the crawler.

                        * **StartedOn** *(datetime) --*

                          The date and time on which the crawl
                          started.

                        * **CompletedOn** *(datetime) --*

                          The date and time on which the crawl
                          completed.

                        * **ErrorMessage** *(string) --*

                          The error message associated with the crawl.

                        * **LogGroup** *(string) --*

                          The log group associated with the crawl.

                        * **LogStream** *(string) --*

                          The log stream associated with the crawl.

              * **Edges** *(list) --*

                A list of all the directed connections between the
                nodes belonging to the workflow.

                * *(dict) --*

                  An edge represents a directed connection between two
                  Glue components that are part of the workflow the
                  edge belongs to.

                  * **SourceId** *(string) --*

                    The unique ID of the node within the workflow
                    where the edge starts.

                  * **DestinationId** *(string) --*

                    The unique ID of the node within the workflow
                    where the edge ends.

            * **MaxConcurrentRuns** *(integer) --*

              You can use this parameter to prevent unwanted multiple
              updates to data, to control costs, or in some cases, to
              prevent exceeding the maximum number of concurrent runs
              of any of the component jobs. If you leave this
              parameter blank, there is no limit to the number of
              concurrent workflow runs.

            * **BlueprintDetails** *(dict) --*

              This structure indicates the details of the blueprint
              that this particular workflow is created from.

              * **BlueprintName** *(string) --*

                The name of the blueprint.

              * **RunId** *(string) --*

                The run ID for this blueprint.

        * **MissingWorkflows** *(list) --*

          A list of names of workflows not found.

          * *(string) --*

   **Exceptions**

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"


resume_workflow_run
*******************

Glue.Client.resume_workflow_run(**kwargs)

   Restarts selected nodes of a previous partially completed workflow
   run and resumes the workflow run. The selected nodes and all nodes
   that are downstream from the selected nodes are run.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.resume_workflow_run(
          Name='string',
          RunId='string',
          NodeIds=[
              'string',
          ]
      )

   Parameters:
      * **Name** (*string*) --

        **[REQUIRED]**

        The name of the workflow to resume.

      * **RunId** (*string*) --

        **[REQUIRED]**

        The ID of the workflow run to resume.

      * **NodeIds** (*list*) --

        **[REQUIRED]**

        A list of the node IDs for the nodes you want to restart. The
        nodes that are to be restarted must have a run attempt in the
        original run.

        * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'RunId': 'string',
             'NodeIds': [
                 'string',
             ]
         }

      **Response Structure**

      * *(dict) --*

        * **RunId** *(string) --*

          The new ID assigned to the resumed workflow run. Each resume
          of a workflow run will have a new run ID.

        * **NodeIds** *(list) --*

          A list of the node IDs for the nodes that were actually
          restarted.

          * *(string) --*

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ConcurrentRunsExceededException"

   * "Glue.Client.exceptions.IllegalWorkflowStateException"
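
The request and response shapes above can be combined into a small helper. The following is a minimal sketch, not an official usage pattern: the function name "resume_failed_nodes" and its return format are hypothetical, and the empty-list guard is a local check in this sketch rather than documented service behaviour. It compares the node IDs that were requested with the ones the response reports as actually restarted.

```python
def resume_failed_nodes(client, workflow_name, run_id, node_ids):
    """Resume a workflow run from the given nodes and report what restarted.

    `client` is expected to behave like boto3.client('glue').
    """
    if not node_ids:
        # Local guard (an assumption of this sketch): NodeIds is REQUIRED.
        raise ValueError("node_ids must contain at least one node ID")

    response = client.resume_workflow_run(
        Name=workflow_name,
        RunId=run_id,
        NodeIds=list(node_ids),
    )

    requested = set(node_ids)
    restarted = set(response.get("NodeIds", []))
    return {
        # Each resume is assigned a new run ID.
        "new_run_id": response["RunId"],
        "restarted": sorted(restarted),
        # Requested nodes that the response did not list as restarted.
        "not_restarted": sorted(requested - restarted),
    }
```

With a real client this might be called as "resume_failed_nodes(boto3.client('glue'), 'my-workflow', run_id, node_ids)", where the workflow name is a placeholder.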


reset_job_bookmark
******************

Glue.Client.reset_job_bookmark(**kwargs)

   Resets a bookmark entry.

   For more information about enabling and using job bookmarks, see:

   * Tracking processed data using job bookmarks

   * Job parameters used by Glue

   * Job structure

   See also: AWS API Documentation

   **Request Syntax**

      response = client.reset_job_bookmark(
          JobName='string',
          RunId='string'
      )

   Parameters:
      * **JobName** (*string*) --

        **[REQUIRED]**

        The name of the job in question.

      * **RunId** (*string*) -- The unique run identifier associated
        with this job run.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'JobBookmarkEntry': {
                 'JobName': 'string',
                 'Version': 123,
                 'Run': 123,
                 'Attempt': 123,
                 'PreviousRunId': 'string',
                 'RunId': 'string',
                 'JobBookmark': 'string'
             }
         }

      **Response Structure**

      * *(dict) --*

        * **JobBookmarkEntry** *(dict) --*

          The reset bookmark entry.

          * **JobName** *(string) --*

            The name of the job in question.

          * **Version** *(integer) --*

            The version of the job.

          * **Run** *(integer) --*

            The run ID number.

          * **Attempt** *(integer) --*

            The attempt ID number.

          * **PreviousRunId** *(string) --*

            The unique run identifier associated with the previous job
            run.

          * **RunId** *(string) --*

            The run ID number.

          * **JobBookmark** *(string) --*

            The bookmark itself.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"


update_registry
***************

Glue.Client.update_registry(**kwargs)

   Updates an existing registry which is used to hold a collection of
   schemas. The updated properties relate to the registry, and do not
   modify any of the schemas within the registry.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.update_registry(
          RegistryId={
              'RegistryName': 'string',
              'RegistryArn': 'string'
          },
          Description='string'
      )

   Parameters:
      * **RegistryId** (*dict*) --

        **[REQUIRED]**

        This is a wrapper structure that may contain the registry name
        and Amazon Resource Name (ARN).

        * **RegistryName** *(string) --*

          Name of the registry. Used only for lookup. One of
          "RegistryArn" or "RegistryName" has to be provided.

        * **RegistryArn** *(string) --*

          ARN of the registry to be updated. One of "RegistryArn" or
          "RegistryName" has to be provided.

      * **Description** (*string*) --

        **[REQUIRED]**

        A description of the registry. If description is not provided,
        this field will not be updated.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'RegistryName': 'string',
             'RegistryArn': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **RegistryName** *(string) --*

          The name of the updated registry.

        * **RegistryArn** *(string) --*

          The Amazon Resource Name (ARN) of the updated registry.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.ConcurrentModificationException"

   * "Glue.Client.exceptions.InternalServiceException"


list_jobs
*********

Glue.Client.list_jobs(**kwargs)

   Retrieves the names of all job resources in this Amazon Web
   Services account, or the resources with the specified tag. This
   operation allows you to see which resources are available in your
   account, and their names.

   This operation takes the optional "Tags" field, which you can use
   as a filter on the response so that tagged resources can be
   retrieved as a group. If you choose to use tags filtering, only
   resources with the tag are retrieved.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.list_jobs(
          NextToken='string',
          MaxResults=123,
          Tags={
              'string': 'string'
          }
      )

   Parameters:
      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation request.

      * **MaxResults** (*integer*) -- The maximum size of a list to
        return.

      * **Tags** (*dict*) --

        Specifies to return only these tagged resources.

        * *(string) --*

          * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'JobNames': [
                 'string',
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **JobNames** *(list) --*

          The names of all jobs in the account, or the jobs with the
          specified tags.

          * *(string) --*

        * **NextToken** *(string) --*

          A continuation token, if the returned list does not contain
          the last job available.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
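
Because the response carries a "NextToken" whenever more names remain, a caller typically loops until the token is absent. The generator below is a minimal sketch of that loop using only the parameters documented above; the name "iter_job_names" and its "page_size" argument are hypothetical, and "client" is assumed to behave like "boto3.client('glue')".

```python
def iter_job_names(client, tags=None, page_size=None):
    """Yield every job name, following NextToken until it is absent."""
    kwargs = {}
    if tags:
        # Only resources carrying these tags are returned.
        kwargs["Tags"] = dict(tags)
    if page_size is not None:
        kwargs["MaxResults"] = page_size

    while True:
        response = client.list_jobs(**kwargs)
        yield from response.get("JobNames", [])

        token = response.get("NextToken")
        if not token:
            break  # no continuation token: this was the last page
        kwargs["NextToken"] = token
```

The same loop shape should apply to other operations that return a "NextToken", with the method name and result key swapped accordingly.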


list_triggers
*************

Glue.Client.list_triggers(**kwargs)

   Retrieves the names of all trigger resources in this Amazon Web
   Services account, or the resources with the specified tag. This
   operation allows you to see which resources are available in your
   account, and their names.

   This operation takes the optional "Tags" field, which you can use
   as a filter on the response so that tagged resources can be
   retrieved as a group. If you choose to use tags filtering, only
   resources with the tag are retrieved.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.list_triggers(
          NextToken='string',
          DependentJobName='string',
          MaxResults=123,
          Tags={
              'string': 'string'
          }
      )

   Parameters:
      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation request.

      * **DependentJobName** (*string*) -- The name of the job for
        which to retrieve triggers. The trigger that can start this
        job is returned. If there is no such trigger, all triggers are
        returned.

      * **MaxResults** (*integer*) -- The maximum size of a list to
        return.

      * **Tags** (*dict*) --

        Specifies to return only these tagged resources.

        * *(string) --*

          * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'TriggerNames': [
                 'string',
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **TriggerNames** *(list) --*

          The names of all triggers in the account, or the triggers
          with the specified tags.

          * *(string) --*

        * **NextToken** *(string) --*

          A continuation token, if the returned list does not contain
          the last trigger available.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"


list_schema_versions
********************

Glue.Client.list_schema_versions(**kwargs)

   Returns a list of schema versions that you have created, with
   minimal information. Schema versions in Deleted status will not be
   included in the results. Empty results will be returned if there
   are no schema versions available.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.list_schema_versions(
          SchemaId={
              'SchemaArn': 'string',
              'SchemaName': 'string',
              'RegistryName': 'string'
          },
          MaxResults=123,
          NextToken='string'
      )

   Parameters:
      * **SchemaId** (*dict*) --

        **[REQUIRED]**

        This is a wrapper structure to contain schema identity fields.
        The structure contains:

        * SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the
          schema. Either "SchemaArn" or "SchemaName" and
          "RegistryName" has to be provided.

        * SchemaId$SchemaName: The name of the schema. Either
          "SchemaArn" or "SchemaName" and "RegistryName" has to be
          provided.

        * **SchemaArn** *(string) --*

          The Amazon Resource Name (ARN) of the schema. One of
          "SchemaArn" or "SchemaName" has to be provided.

        * **SchemaName** *(string) --*

          The name of the schema. One of "SchemaArn" or "SchemaName"
          has to be provided.

        * **RegistryName** *(string) --*

          The name of the schema registry that contains the schema.

      * **MaxResults** (*integer*) -- Maximum number of results
        required per page. If the value is not supplied, this will be
        defaulted to 25 per page.

      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation call.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Schemas': [
                 {
                     'SchemaArn': 'string',
                     'SchemaVersionId': 'string',
                     'VersionNumber': 123,
                     'Status': 'AVAILABLE'|'PENDING'|'FAILURE'|'DELETING',
                     'CreatedTime': 'string'
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Schemas** *(list) --*

          An array of "SchemaVersionList" objects containing details
          of each schema version.

          * *(dict) --*

            An object containing the details about a schema version.

            * **SchemaArn** *(string) --*

              The Amazon Resource Name (ARN) of the schema.

            * **SchemaVersionId** *(string) --*

              The unique identifier of the schema version.

            * **VersionNumber** *(integer) --*

              The version number of the schema.

            * **Status** *(string) --*

              The status of the schema version.

            * **CreatedTime** *(string) --*

              The date and time the schema version was created.

        * **NextToken** *(string) --*

          A continuation token for paginating the returned list of
          schema versions, returned if the current segment of the
          list is not the last.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"
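
The "SchemaId" wrapper accepts either a "SchemaArn" or a "SchemaName" together with a "RegistryName". The helper below is a minimal sketch that builds the wrapper and rejects incomplete input before any request is made. The name "build_schema_id" is hypothetical, and treating a mix of ARN and name fields as an error is a choice made in this sketch, not something the service is documented to enforce.

```python
def build_schema_id(schema_arn=None, schema_name=None, registry_name=None):
    """Return a SchemaId dict from an ARN, or from a name plus registry."""
    if schema_arn and (schema_name or registry_name):
        # Strictness chosen by this sketch to keep the lookup unambiguous.
        raise ValueError("pass either schema_arn, or schema_name with registry_name")
    if schema_arn:
        return {"SchemaArn": schema_arn}
    if schema_name and registry_name:
        return {"SchemaName": schema_name, "RegistryName": registry_name}
    raise ValueError("schema_arn, or schema_name and registry_name, is required")
```

The result can be passed straight through, for example "client.list_schema_versions(SchemaId=build_schema_id(schema_name='orders', registry_name='default-registry'))", where both names are placeholders.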


get_partitions
**************

Glue.Client.get_partitions(**kwargs)

   Retrieves information about the partitions in a table.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_partitions(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          Expression='string',
          NextToken='string',
          Segment={
              'SegmentNumber': 123,
              'TotalSegments': 123
          },
          MaxResults=123,
          ExcludeColumnSchema=True|False,
          TransactionId='string',
          QueryAsOfTime=datetime(2015, 1, 1)
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog where
        the partitions in question reside. If none is provided, the
        Amazon Web Services account ID is used by default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the catalog database where the partitions reside.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the partitions' table.

      * **Expression** (*string*) --

        An expression that filters the partitions to be returned.

        The expression uses SQL syntax similar to the SQL "WHERE"
        filter clause. The SQL statement parser JSQLParser parses the
        expression.

        *Operators*: The following are the operators that you can use
        in the "Expression" API call:

           =

        Checks whether the values of the two operands are equal; if
        yes, then the condition becomes true.

        Example: Assume 'variable a' holds 10 and 'variable b' holds
        20.

        (a = b) is not true.

           < >

        Checks whether the values of two operands are equal; if the
        values are not equal, then the condition becomes true.

        Example: (a < > b) is true.

           >

        Checks whether the value of the left operand is greater than
        the value of the right operand; if yes, then the condition
        becomes true.

        Example: (a > b) is not true.

           <

        Checks whether the value of the left operand is less than the
        value of the right operand; if yes, then the condition becomes
        true.

        Example: (a < b) is true.

           >=

        Checks whether the value of the left operand is greater than
        or equal to the value of the right operand; if yes, then the
        condition becomes true.

        Example: (a >= b) is not true.

           <=

        Checks whether the value of the left operand is less than or
        equal to the value of the right operand; if yes, then the
        condition becomes true.

        Example: (a <= b) is true.

           AND, OR, IN, BETWEEN, LIKE, NOT, IS NULL

        Logical operators.

        *Supported Partition Key Types*: The following are the
        supported partition keys.

        * "string"

        * "date"

        * "timestamp"

        * "int"

        * "bigint"

        * "long"

        * "tinyint"

        * "smallint"

        * "decimal"

        If a type is encountered that is not valid, an exception is
        thrown.

        The following list shows the valid operators on each type.
        When you define a crawler, the "partitionKey" type is created
        as a "STRING", to be compatible with the catalog partitions.

        *Sample API Call*:

      * **NextToken** (*string*) -- A continuation token, if this is
        not the first call to retrieve these partitions.

      * **Segment** (*dict*) --

        The segment of the table's partitions to scan in this request.

        * **SegmentNumber** *(integer) --* **[REQUIRED]**

          The zero-based index number of the segment. For example, if
          the total number of segments is 4, "SegmentNumber" values
          range from 0 through 3.

        * **TotalSegments** *(integer) --* **[REQUIRED]**

          The total number of segments.

      * **MaxResults** (*integer*) -- The maximum number of partitions
        to return in a single response.

      * **ExcludeColumnSchema** (*boolean*) -- When true, the
        partition column schema is not returned. Useful when you are
        interested only in other partition attributes such as
        partition values or location. This approach avoids the problem
        of a large response by not returning duplicate data.

      * **TransactionId** (*string*) -- The transaction ID at which to
        read the partition contents.

      * **QueryAsOfTime** (*datetime*) -- The time as of when to read
        the partition contents. If not set, the most recent
        transaction commit time will be used. Cannot be specified
        along with "TransactionId".
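
The "Expression" and "Segment" parameters can be combined so that several workers each scan one slice of a filtered partition set. The helper below is a minimal sketch that only builds the keyword arguments for each segment. The name "segment_requests" is hypothetical, as are the "year" and "month" partition keys used in the usage note.

```python
def segment_requests(database, table, total_segments, expression=None):
    """Build one get_partitions kwargs dict per segment (zero-based)."""
    if total_segments < 1:
        raise ValueError("total_segments must be at least 1")

    base = {"DatabaseName": database, "TableName": table}
    if expression:
        # SQL WHERE-style filter over partition keys.
        base["Expression"] = expression

    return [
        {
            **base,
            "Segment": {
                "SegmentNumber": number,  # ranges from 0 to total_segments - 1
                "TotalSegments": total_segments,
            },
        }
        for number in range(total_segments)
    ]
```

Each dict can then be passed as "client.get_partitions(**kwargs)", for example from "segment_requests('sales_db', 'orders', 4, expression="year = '2024' AND month BETWEEN '01' AND '03'")". Every segment still needs its own "NextToken" loop if its results span more than one page.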

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Partitions': [
                 {
                     'Values': [
                         'string',
                     ],
                     'DatabaseName': 'string',
                     'TableName': 'string',
                     'CreationTime': datetime(2015, 1, 1),
                     'LastAccessTime': datetime(2015, 1, 1),
                     'StorageDescriptor': {
                         'Columns': [
                             {
                                 'Name': 'string',
                                 'Type': 'string',
                                 'Comment': 'string',
                                 'Parameters': {
                                     'string': 'string'
                                 }
                             },
                         ],
                         'Location': 'string',
                         'AdditionalLocations': [
                             'string',
                         ],
                         'InputFormat': 'string',
                         'OutputFormat': 'string',
                         'Compressed': True|False,
                         'NumberOfBuckets': 123,
                         'SerdeInfo': {
                             'Name': 'string',
                             'SerializationLibrary': 'string',
                             'Parameters': {
                                 'string': 'string'
                             }
                         },
                         'BucketColumns': [
                             'string',
                         ],
                         'SortColumns': [
                             {
                                 'Column': 'string',
                                 'SortOrder': 123
                             },
                         ],
                         'Parameters': {
                             'string': 'string'
                         },
                         'SkewedInfo': {
                             'SkewedColumnNames': [
                                 'string',
                             ],
                             'SkewedColumnValues': [
                                 'string',
                             ],
                             'SkewedColumnValueLocationMaps': {
                                 'string': 'string'
                             }
                         },
                         'StoredAsSubDirectories': True|False,
                         'SchemaReference': {
                             'SchemaId': {
                                 'SchemaArn': 'string',
                                 'SchemaName': 'string',
                                 'RegistryName': 'string'
                             },
                             'SchemaVersionId': 'string',
                             'SchemaVersionNumber': 123
                         }
                     },
                     'Parameters': {
                         'string': 'string'
                     },
                     'LastAnalyzedTime': datetime(2015, 1, 1),
                     'CatalogId': 'string'
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Partitions** *(list) --*

          A list of requested partitions.

          * *(dict) --*

            Represents a slice of table data.

            * **Values** *(list) --*

              The values of the partition.

              * *(string) --*

            * **DatabaseName** *(string) --*

              The name of the catalog database in which to create the
              partition.

            * **TableName** *(string) --*

              The name of the database table in which to create the
              partition.

            * **CreationTime** *(datetime) --*

              The time at which the partition was created.

            * **LastAccessTime** *(datetime) --*

              The last time at which the partition was accessed.

            * **StorageDescriptor** *(dict) --*

              Provides information about the physical location where
              the partition is stored.

              * **Columns** *(list) --*

                A list of the "Columns" in the table.

                * *(dict) --*

                  A column in a "Table".

                  * **Name** *(string) --*

                    The name of the "Column".

                  * **Type** *(string) --*

                    The data type of the "Column".

                  * **Comment** *(string) --*

                    A free-form text comment.

                  * **Parameters** *(dict) --*

                    These key-value pairs define properties associated
                    with the column.

                    * *(string) --*

                      * *(string) --*

              * **Location** *(string) --*

                The physical location of the table. By default, this
                takes the form of the warehouse location, followed by
                the database location in the warehouse, followed by
                the table name.

              * **AdditionalLocations** *(list) --*

                A list of locations that point to the path where a
                Delta table is located.

                * *(string) --*

              * **InputFormat** *(string) --*

                The input format: "SequenceFileInputFormat" (binary),
                or "TextInputFormat", or a custom format.

              * **OutputFormat** *(string) --*

                The output format: "SequenceFileOutputFormat"
                (binary), or "IgnoreKeyTextOutputFormat", or a custom
                format.

              * **Compressed** *(boolean) --*

                "True" if the data in the table is compressed, or
                "False" if not.

              * **NumberOfBuckets** *(integer) --*

                Must be specified if the table contains any dimension
                columns.

              * **SerdeInfo** *(dict) --*

                The serialization/deserialization (SerDe) information.

                * **Name** *(string) --*

                  Name of the SerDe.

                * **SerializationLibrary** *(string) --*

                  Usually the class that implements the SerDe. An
                  example is "org.apache.hadoop.hive.serde2.columnar.
                  ColumnarSerDe".

                * **Parameters** *(dict) --*

                  These key-value pairs define initialization
                  parameters for the SerDe.

                  * *(string) --*

                    * *(string) --*

              * **BucketColumns** *(list) --*

                A list of reducer grouping columns, clustering
                columns, and bucketing columns in the table.

                * *(string) --*

              * **SortColumns** *(list) --*

                A list specifying the sort order of each bucket in the
                table.

                * *(dict) --*

                  Specifies the sort order of a sorted column.

                  * **Column** *(string) --*

                    The name of the column.

                  * **SortOrder** *(integer) --*

                    Indicates that the column is sorted in ascending
                    order ("== 1"), or in descending order ("== 0").

              * **Parameters** *(dict) --*

                The user-supplied properties in key-value form.

                * *(string) --*

                  * *(string) --*

              * **SkewedInfo** *(dict) --*

                The information about values that appear frequently in
                a column (skewed values).

                * **SkewedColumnNames** *(list) --*

                  A list of names of columns that contain skewed
                  values.

                  * *(string) --*

                * **SkewedColumnValues** *(list) --*

                  A list of values that appear so frequently as to be
                  considered skewed.

                  * *(string) --*

                * **SkewedColumnValueLocationMaps** *(dict) --*

                  A mapping of skewed values to the columns that
                  contain them.

                  * *(string) --*

                    * *(string) --*

              * **StoredAsSubDirectories** *(boolean) --*

                "True" if the table data is stored in subdirectories,
                or "False" if not.

              * **SchemaReference** *(dict) --*

                An object that references a schema stored in the Glue
                Schema Registry.

                When creating a table, you can pass an empty list of
                columns for the schema, and instead use a schema
                reference.

                * **SchemaId** *(dict) --*

                  A structure that contains schema identity fields.
                  Either this or the "SchemaVersionId" has to be
                  provided.

                  * **SchemaArn** *(string) --*

                    The Amazon Resource Name (ARN) of the schema. One
                    of "SchemaArn" or "SchemaName" has to be provided.

                  * **SchemaName** *(string) --*

                    The name of the schema. One of "SchemaArn" or
                    "SchemaName" has to be provided.

                  * **RegistryName** *(string) --*

                    The name of the schema registry that contains the
                    schema.

                * **SchemaVersionId** *(string) --*

                  The unique ID assigned to a version of the schema.
                  Either this or the "SchemaId" has to be provided.

                * **SchemaVersionNumber** *(integer) --*

                  The version number of the schema.

            * **Parameters** *(dict) --*

              These key-value pairs define partition parameters.

              * *(string) --*

                * *(string) --*

            * **LastAnalyzedTime** *(datetime) --*

              The last time at which column statistics were computed
              for this partition.

            * **CatalogId** *(string) --*

              The ID of the Data Catalog in which the partition
              resides.

        * **NextToken** *(string) --*

          A continuation token, if the returned list of partitions
          does not include the last one.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.GlueEncryptionException"

   * "Glue.Client.exceptions.InvalidStateException"

   * "Glue.Client.exceptions.ResourceNotReadyException"

   * "Glue.Client.exceptions.FederationSourceException"

   * "Glue.Client.exceptions.FederationSourceRetryableException"


list_data_quality_results
*************************

Glue.Client.list_data_quality_results(**kwargs)

   Returns all data quality execution results for your account.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.list_data_quality_results(
          Filter={
              'DataSource': {
                  'GlueTable': {
                      'DatabaseName': 'string',
                      'TableName': 'string',
                      'CatalogId': 'string',
                      'ConnectionName': 'string',
                      'AdditionalOptions': {
                          'string': 'string'
                      }
                  }
              },
              'JobName': 'string',
              'JobRunId': 'string',
              'StartedAfter': datetime(2015, 1, 1),
              'StartedBefore': datetime(2015, 1, 1)
          },
          NextToken='string',
          MaxResults=123
      )

   Parameters:
      * **Filter** (*dict*) --

        The filter criteria.

        * **DataSource** *(dict) --*

          Filter results by the specified data source, for example to
          retrieve all results for a Glue table.

          * **GlueTable** *(dict) --* **[REQUIRED]**

            A Glue table.

            * **DatabaseName** *(string) --* **[REQUIRED]**

              A database name in the Glue Data Catalog.

            * **TableName** *(string) --* **[REQUIRED]**

              A table name in the Glue Data Catalog.

            * **CatalogId** *(string) --*

              A unique identifier for the Glue Data Catalog.

            * **ConnectionName** *(string) --*

              The name of the connection to the Glue Data Catalog.

            * **AdditionalOptions** *(dict) --*

              Additional options for the table. Currently there are
              two keys supported:

              * "pushDownPredicate": to filter on partitions without
                having to list and read all the files in your dataset.

              * "catalogPartitionPredicate": to use server-side
                partition pruning using partition indexes in the Glue
                Data Catalog.

              * *(string) --*

                * *(string) --*

        * **JobName** *(string) --*

          Filter results by the specified job name.

        * **JobRunId** *(string) --*

          Filter results by the specified job run ID.

        * **StartedAfter** *(datetime) --*

          Filter results by runs that started after this time.

        * **StartedBefore** *(datetime) --*

          Filter results by runs that started before this time.

      * **NextToken** (*string*) -- A pagination token to offset the
        results.

      * **MaxResults** (*integer*) -- The maximum number of results to
        return.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Results': [
                 {
                     'ResultId': 'string',
                     'DataSource': {
                         'GlueTable': {
                             'DatabaseName': 'string',
                             'TableName': 'string',
                             'CatalogId': 'string',
                             'ConnectionName': 'string',
                             'AdditionalOptions': {
                                 'string': 'string'
                             }
                         }
                     },
                     'JobName': 'string',
                     'JobRunId': 'string',
                     'StartedOn': datetime(2015, 1, 1)
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Results** *(list) --*

          A list of "DataQualityResultDescription" objects.

          * *(dict) --*

            Describes a data quality result.

            * **ResultId** *(string) --*

              The unique result ID for this data quality result.

            * **DataSource** *(dict) --*

              The data source (a Glue table) associated with the data
              quality result.

              * **GlueTable** *(dict) --*

                A Glue table.

                * **DatabaseName** *(string) --*

                  A database name in the Glue Data Catalog.

                * **TableName** *(string) --*

                  A table name in the Glue Data Catalog.

                * **CatalogId** *(string) --*

                  A unique identifier for the Glue Data Catalog.

                * **ConnectionName** *(string) --*

                  The name of the connection to the Glue Data Catalog.

                * **AdditionalOptions** *(dict) --*

                  Additional options for the table. Currently there
                  are two keys supported:

                  * "pushDownPredicate": to filter on partitions
                    without having to list and read all the files in
                    your dataset.

                  * "catalogPartitionPredicate": to use server-side
                    partition pruning using partition indexes in the
                    Glue Data Catalog.

                  * *(string) --*

                    * *(string) --*

            * **JobName** *(string) --*

              The job name associated with the data quality result.

            * **JobRunId** *(string) --*

              The job run ID associated with the data quality result.

            * **StartedOn** *(datetime) --*

              The time that the run started for this data quality
              result.

        * **NextToken** *(string) --*

          A pagination token, if more results are available.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"
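
   The "NextToken" and "MaxResults" parameters above imply a paging
   loop. The following is a minimal sketch of one, assuming you only
   filter by a Glue table; the helper name "iter_data_quality_results"
   and its arguments are hypothetical, not part of boto3. It takes any
   object that exposes "list_data_quality_results", such as the client
   returned by "boto3.client('glue')".

```python
def iter_data_quality_results(client, database, table, started_after=None, page_size=50):
    """Yield every result description for one Glue table, following NextToken.

    `client` is assumed to be a Glue client (or a compatible stub).
    """
    # GlueTable requires DatabaseName and TableName when DataSource is given.
    result_filter = {
        "DataSource": {"GlueTable": {"DatabaseName": database, "TableName": table}}
    }
    if started_after is not None:
        # Optional datetime: keep only runs that started after this time.
        result_filter["StartedAfter"] = started_after

    kwargs = {"Filter": result_filter, "MaxResults": page_size}
    while True:
        response = client.list_data_quality_results(**kwargs)
        yield from response.get("Results", [])
        token = response.get("NextToken")
        if not token:
            # No token means the last page has been returned.
            break
        kwargs["NextToken"] = token
```

   Each yielded item is one entry of "Results", so callers can read
   "ResultId", "JobName", "JobRunId", and "StartedOn" directly.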


delete_crawler
**************

Glue.Client.delete_crawler(**kwargs)

   Removes a specified crawler from the Glue Data Catalog, unless the
   crawler state is "RUNNING".

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_crawler(
          Name='string'
      )

   Parameters:
      **Name** (*string*) --

      **[REQUIRED]**

      The name of the crawler to remove.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.CrawlerRunningException"

   * "Glue.Client.exceptions.SchedulerTransitioningException"

   * "Glue.Client.exceptions.OperationTimeoutException"
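
   Because the call is rejected while the crawler state is "RUNNING",
   callers often want to tell that case apart from a missing crawler.
   The sketch below shows one way to do so using the exception classes
   listed above; the helper name "delete_crawler_if_idle" and its
   string return values are hypothetical.

```python
def delete_crawler_if_idle(client, name):
    """Try to delete a crawler and report what happened.

    Returns "deleted", "running", or "not_found". Other exceptions
    (for example OperationTimeoutException) are left to propagate.
    """
    try:
        client.delete_crawler(Name=name)
        return "deleted"
    except client.exceptions.CrawlerRunningException:
        # The crawler is still running, so it was not removed.
        return "running"
    except client.exceptions.EntityNotFoundException:
        # No crawler with this name exists.
        return "not_found"
```

   With a real client this would be called as
   "delete_crawler_if_idle(boto3.client('glue'), 'my-crawler')", where
   the crawler name is a placeholder.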


get_blueprint_run
*****************

Glue.Client.get_blueprint_run(**kwargs)

   Retrieves the details of a blueprint run.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_blueprint_run(
          BlueprintName='string',
          RunId='string'
      )

   Parameters:
      * **BlueprintName** (*string*) --

        **[REQUIRED]**

        The name of the blueprint.

      * **RunId** (*string*) --

        **[REQUIRED]**

        The run ID for the blueprint run you want to retrieve.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'BlueprintRun': {
                 'BlueprintName': 'string',
                 'RunId': 'string',
                 'WorkflowName': 'string',
                 'State': 'RUNNING'|'SUCCEEDED'|'FAILED'|'ROLLING_BACK',
                 'StartedOn': datetime(2015, 1, 1),
                 'CompletedOn': datetime(2015, 1, 1),
                 'ErrorMessage': 'string',
                 'RollbackErrorMessage': 'string',
                 'Parameters': 'string',
                 'RoleArn': 'string'
             }
         }

      **Response Structure**

      * *(dict) --*

        * **BlueprintRun** *(dict) --*

          Returns a "BlueprintRun" object.

          * **BlueprintName** *(string) --*

            The name of the blueprint.

          * **RunId** *(string) --*

            The run ID for this blueprint run.

          * **WorkflowName** *(string) --*

            The name of a workflow that is created as a result of a
            successful blueprint run. If a blueprint run has an error,
            there will not be a workflow created.

          * **State** *(string) --*

            The state of the blueprint run. Possible values are:

            * "RUNNING": The blueprint run is in progress.

            * "SUCCEEDED": The blueprint run completed successfully.

            * "FAILED": The blueprint run failed and rollback is
              complete.

            * "ROLLING_BACK": The blueprint run failed and rollback is
              in progress.

          * **StartedOn** *(datetime) --*

            The date and time that the blueprint run started.

          * **CompletedOn** *(datetime) --*

            The date and time that the blueprint run completed.

          * **ErrorMessage** *(string) --*

            Indicates any errors that are seen while running the
            blueprint.

          * **RollbackErrorMessage** *(string) --*

            If any errors occur while creating the entities of a
            workflow, Glue tries to roll back and delete the entities
            created up to that point. This attribute reports the
            errors seen while deleting those entities.

          * **Parameters** *(string) --*

            The blueprint parameters as a string. You must provide a
            value for each key required by the parameter spec defined
            in "Blueprint$ParameterSpec".

          * **RoleArn** *(string) --*

            The role ARN. This role will be assumed by the Glue
            service and will be used to create the workflow and other
            entities of a workflow.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
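
   The "State" values above suggest a simple polling pattern: keep
   calling until the run is neither in progress nor rolling back. The
   following is a sketch under that reading, treating "SUCCEEDED" and
   "FAILED" as the only final states. The helper name
   "wait_for_blueprint_run", the polling interval, and the attempt
   limit are all hypothetical choices, not boto3 features.

```python
import time

# FAILED means rollback is complete; ROLLING_BACK is still in progress.
FINAL_STATES = {"SUCCEEDED", "FAILED"}


def wait_for_blueprint_run(client, blueprint_name, run_id,
                           poll_seconds=10, max_polls=60, sleep=time.sleep):
    """Poll get_blueprint_run until the run reaches a final state.

    Returns the BlueprintRun dict, or raises TimeoutError after
    `max_polls` attempts. `sleep` is injectable to make testing easy.
    """
    for _ in range(max_polls):
        response = client.get_blueprint_run(
            BlueprintName=blueprint_name, RunId=run_id
        )
        run = response["BlueprintRun"]
        if run["State"] in FINAL_STATES:
            return run
        sleep(poll_seconds)
    raise TimeoutError(
        f"blueprint run {run_id!r} did not finish after {max_polls} polls"
    )
```

   On a "FAILED" result, "ErrorMessage" and "RollbackErrorMessage" in
   the returned dict are the fields to inspect.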


update_column_statistics_for_table
**********************************

Glue.Client.update_column_statistics_for_table(**kwargs)

   Creates or updates table statistics of columns.

   The Identity and Access Management (IAM) permission required for
   this operation is "UpdateTable".

   See also: AWS API Documentation

   **Request Syntax**

      response = client.update_column_statistics_for_table(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          ColumnStatisticsList=[
              {
                  'ColumnName': 'string',
                  'ColumnType': 'string',
                  'AnalyzedTime': datetime(2015, 1, 1),
                  'StatisticsData': {
                      'Type': 'BOOLEAN'|'DATE'|'DECIMAL'|'DOUBLE'|'LONG'|'STRING'|'BINARY',
                      'BooleanColumnStatisticsData': {
                          'NumberOfTrues': 123,
                          'NumberOfFalses': 123,
                          'NumberOfNulls': 123
                      },
                      'DateColumnStatisticsData': {
                          'MinimumValue': datetime(2015, 1, 1),
                          'MaximumValue': datetime(2015, 1, 1),
                          'NumberOfNulls': 123,
                          'NumberOfDistinctValues': 123
                      },
                      'DecimalColumnStatisticsData': {
                          'MinimumValue': {
                              'UnscaledValue': b'bytes',
                              'Scale': 123
                          },
                          'MaximumValue': {
                              'UnscaledValue': b'bytes',
                              'Scale': 123
                          },
                          'NumberOfNulls': 123,
                          'NumberOfDistinctValues': 123
                      },
                      'DoubleColumnStatisticsData': {
                          'MinimumValue': 123.0,
                          'MaximumValue': 123.0,
                          'NumberOfNulls': 123,
                          'NumberOfDistinctValues': 123
                      },
                      'LongColumnStatisticsData': {
                          'MinimumValue': 123,
                          'MaximumValue': 123,
                          'NumberOfNulls': 123,
                          'NumberOfDistinctValues': 123
                      },
                      'StringColumnStatisticsData': {
                          'MaximumLength': 123,
                          'AverageLength': 123.0,
                          'NumberOfNulls': 123,
                          'NumberOfDistinctValues': 123
                      },
                      'BinaryColumnStatisticsData': {
                          'MaximumLength': 123,
                          'AverageLength': 123.0,
                          'NumberOfNulls': 123
                      }
                  }
              },
          ]
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog where
        the table resides. If none is supplied, the Amazon Web
        Services account ID is used by default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the catalog database where the table resides.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table.

      * **ColumnStatisticsList** (*list*) --

        **[REQUIRED]**

        A list of the column statistics.

        * *(dict) --*

          Represents the generated column-level statistics for a table
          or partition.

          * **ColumnName** *(string) --* **[REQUIRED]**

            The name of the column that the statistics belong to.

          * **ColumnType** *(string) --* **[REQUIRED]**

            The data type of the column.

          * **AnalyzedTime** *(datetime) --* **[REQUIRED]**

            The timestamp of when column statistics were generated.

          * **StatisticsData** *(dict) --* **[REQUIRED]**

            A "ColumnStatisticData" object that contains the
            statistics data values.

            * **Type** *(string) --* **[REQUIRED]**

              The type of column statistics data.

            * **BooleanColumnStatisticsData** *(dict) --*

              Boolean column statistics data.

              * **NumberOfTrues** *(integer) --* **[REQUIRED]**

                The number of true values in the column.

              * **NumberOfFalses** *(integer) --* **[REQUIRED]**

                The number of false values in the column.

              * **NumberOfNulls** *(integer) --* **[REQUIRED]**

                The number of null values in the column.

            * **DateColumnStatisticsData** *(dict) --*

              Date column statistics data.

              * **MinimumValue** *(datetime) --*

                The lowest value in the column.

              * **MaximumValue** *(datetime) --*

                The highest value in the column.

              * **NumberOfNulls** *(integer) --* **[REQUIRED]**

                The number of null values in the column.

              * **NumberOfDistinctValues** *(integer) --*
                **[REQUIRED]**

                The number of distinct values in a column.

            * **DecimalColumnStatisticsData** *(dict) --*

              Decimal column statistics data. UnscaledValues within
              are Base64-encoded binary objects storing big-endian,
              two's complement representations of the decimal's
              unscaled value.

              * **MinimumValue** *(dict) --*

                The lowest value in the column.

                * **UnscaledValue** *(bytes) --* **[REQUIRED]**

                  The unscaled numeric value.

                * **Scale** *(integer) --* **[REQUIRED]**

                  The scale that determines where the decimal point
                  falls in the unscaled value.

              * **MaximumValue** *(dict) --*

                The highest value in the column.

                * **UnscaledValue** *(bytes) --* **[REQUIRED]**

                  The unscaled numeric value.

                * **Scale** *(integer) --* **[REQUIRED]**

                  The scale that determines where the decimal point
                  falls in the unscaled value.

              * **NumberOfNulls** *(integer) --* **[REQUIRED]**

                The number of null values in the column.

              * **NumberOfDistinctValues** *(integer) --*
                **[REQUIRED]**

                The number of distinct values in a column.

            * **DoubleColumnStatisticsData** *(dict) --*

              Double column statistics data.

              * **MinimumValue** *(float) --*

                The lowest value in the column.

              * **MaximumValue** *(float) --*

                The highest value in the column.

              * **NumberOfNulls** *(integer) --* **[REQUIRED]**

                The number of null values in the column.

              * **NumberOfDistinctValues** *(integer) --*
                **[REQUIRED]**

                The number of distinct values in a column.

            * **LongColumnStatisticsData** *(dict) --*

              Long column statistics data.

              * **MinimumValue** *(integer) --*

                The lowest value in the column.

              * **MaximumValue** *(integer) --*

                The highest value in the column.

              * **NumberOfNulls** *(integer) --* **[REQUIRED]**

                The number of null values in the column.

              * **NumberOfDistinctValues** *(integer) --*
                **[REQUIRED]**

                The number of distinct values in a column.

            * **StringColumnStatisticsData** *(dict) --*

              String column statistics data.

              * **MaximumLength** *(integer) --* **[REQUIRED]**

                The size of the longest string in the column.

              * **AverageLength** *(float) --* **[REQUIRED]**

                The average string length in the column.

              * **NumberOfNulls** *(integer) --* **[REQUIRED]**

                The number of null values in the column.

              * **NumberOfDistinctValues** *(integer) --*
                **[REQUIRED]**

                The number of distinct values in a column.

            * **BinaryColumnStatisticsData** *(dict) --*

              Binary column statistics data.

              * **MaximumLength** *(integer) --* **[REQUIRED]**

                The size of the longest bit sequence in the column.

              * **AverageLength** *(float) --* **[REQUIRED]**

                The average bit sequence length in the column.

              * **NumberOfNulls** *(integer) --* **[REQUIRED]**

                The number of null values in the column.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Errors': [
                 {
                     'ColumnStatistics': {
                         'ColumnName': 'string',
                         'ColumnType': 'string',
                         'AnalyzedTime': datetime(2015, 1, 1),
                         'StatisticsData': {
                             'Type': 'BOOLEAN'|'DATE'|'DECIMAL'|'DOUBLE'|'LONG'|'STRING'|'BINARY',
                             'BooleanColumnStatisticsData': {
                                 'NumberOfTrues': 123,
                                 'NumberOfFalses': 123,
                                 'NumberOfNulls': 123
                             },
                             'DateColumnStatisticsData': {
                                 'MinimumValue': datetime(2015, 1, 1),
                                 'MaximumValue': datetime(2015, 1, 1),
                                 'NumberOfNulls': 123,
                                 'NumberOfDistinctValues': 123
                             },
                             'DecimalColumnStatisticsData': {
                                 'MinimumValue': {
                                     'UnscaledValue': b'bytes',
                                     'Scale': 123
                                 },
                                 'MaximumValue': {
                                     'UnscaledValue': b'bytes',
                                     'Scale': 123
                                 },
                                 'NumberOfNulls': 123,
                                 'NumberOfDistinctValues': 123
                             },
                             'DoubleColumnStatisticsData': {
                                 'MinimumValue': 123.0,
                                 'MaximumValue': 123.0,
                                 'NumberOfNulls': 123,
                                 'NumberOfDistinctValues': 123
                             },
                             'LongColumnStatisticsData': {
                                 'MinimumValue': 123,
                                 'MaximumValue': 123,
                                 'NumberOfNulls': 123,
                                 'NumberOfDistinctValues': 123
                             },
                             'StringColumnStatisticsData': {
                                 'MaximumLength': 123,
                                 'AverageLength': 123.0,
                                 'NumberOfNulls': 123,
                                 'NumberOfDistinctValues': 123
                             },
                             'BinaryColumnStatisticsData': {
                                 'MaximumLength': 123,
                                 'AverageLength': 123.0,
                                 'NumberOfNulls': 123
                             }
                         }
                     },
                     'Error': {
                         'ErrorCode': 'string',
                         'ErrorMessage': 'string'
                     }
                 },
             ]
         }

      **Response Structure**

      * *(dict) --*

        * **Errors** *(list) --*

          List of ColumnStatisticsErrors.

          * *(dict) --*

            Encapsulates a "ColumnStatistics" object that failed and
            the reason for failure.

            * **ColumnStatistics** *(dict) --*

              The "ColumnStatistics" of the column.

              * **ColumnName** *(string) --*

                The name of the column that the statistics belong to.

              * **ColumnType** *(string) --*

                The data type of the column.

              * **AnalyzedTime** *(datetime) --*

                The timestamp of when column statistics were
                generated.

              * **StatisticsData** *(dict) --*

                A "ColumnStatisticData" object that contains the
                statistics data values.

                * **Type** *(string) --*

                  The type of column statistics data.

                * **BooleanColumnStatisticsData** *(dict) --*

                  Boolean column statistics data.

                  * **NumberOfTrues** *(integer) --*

                    The number of true values in the column.

                  * **NumberOfFalses** *(integer) --*

                    The number of false values in the column.

                  * **NumberOfNulls** *(integer) --*

                    The number of null values in the column.

                * **DateColumnStatisticsData** *(dict) --*

                  Date column statistics data.

                  * **MinimumValue** *(datetime) --*

                    The lowest value in the column.

                  * **MaximumValue** *(datetime) --*

                    The highest value in the column.

                  * **NumberOfNulls** *(integer) --*

                    The number of null values in the column.

                  * **NumberOfDistinctValues** *(integer) --*

                    The number of distinct values in a column.

                * **DecimalColumnStatisticsData** *(dict) --*

                  Decimal column statistics data. UnscaledValues
                  within are Base64-encoded binary objects storing
                  big-endian, two's complement representations of the
                  decimal's unscaled value.

                  * **MinimumValue** *(dict) --*

                    The lowest value in the column.

                    * **UnscaledValue** *(bytes) --*

                      The unscaled numeric value.

                    * **Scale** *(integer) --*

                      The scale that determines where the decimal
                      point falls in the unscaled value.

                  * **MaximumValue** *(dict) --*

                    The highest value in the column.

                    * **UnscaledValue** *(bytes) --*

                      The unscaled numeric value.

                    * **Scale** *(integer) --*

                      The scale that determines where the decimal
                      point falls in the unscaled value.

                  * **NumberOfNulls** *(integer) --*

                    The number of null values in the column.

                  * **NumberOfDistinctValues** *(integer) --*

                    The number of distinct values in a column.

                * **DoubleColumnStatisticsData** *(dict) --*

                  Double column statistics data.

                  * **MinimumValue** *(float) --*

                    The lowest value in the column.

                  * **MaximumValue** *(float) --*

                    The highest value in the column.

                  * **NumberOfNulls** *(integer) --*

                    The number of null values in the column.

                  * **NumberOfDistinctValues** *(integer) --*

                    The number of distinct values in a column.

                * **LongColumnStatisticsData** *(dict) --*

                  Long column statistics data.

                  * **MinimumValue** *(integer) --*

                    The lowest value in the column.

                  * **MaximumValue** *(integer) --*

                    The highest value in the column.

                  * **NumberOfNulls** *(integer) --*

                    The number of null values in the column.

                  * **NumberOfDistinctValues** *(integer) --*

                    The number of distinct values in a column.

                * **StringColumnStatisticsData** *(dict) --*

                  String column statistics data.

                  * **MaximumLength** *(integer) --*

                    The size of the longest string in the column.

                  * **AverageLength** *(float) --*

                    The average string length in the column.

                  * **NumberOfNulls** *(integer) --*

                    The number of null values in the column.

                  * **NumberOfDistinctValues** *(integer) --*

                    The number of distinct values in a column.

                * **BinaryColumnStatisticsData** *(dict) --*

                  Binary column statistics data.

                  * **MaximumLength** *(integer) --*

                    The size of the longest bit sequence in the
                    column.

                  * **AverageLength** *(float) --*

                    The average bit sequence length in the column.

                  * **NumberOfNulls** *(integer) --*

                    The number of null values in the column.

            * **Error** *(dict) --*

              An error message with the reason for the failure of an
              operation.

              * **ErrorCode** *(string) --*

                The code associated with this error.

              * **ErrorMessage** *(string) --*

                A message describing the error.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"


create_connection
*****************

Glue.Client.create_connection(**kwargs)

   Creates a connection definition in the Data Catalog.

   Connections used for creating federated resources require the IAM
   "glue:PassConnection" permission.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_connection(
          CatalogId='string',
          ConnectionInput={
              'Name': 'string',
              'Description': 'string',
              'ConnectionType': 'JDBC'|'SFTP'|'MONGODB'|'KAFKA'|'NETWORK'|'MARKETPLACE'|'CUSTOM'|'SALESFORCE'|'VIEW_VALIDATION_REDSHIFT'|'VIEW_VALIDATION_ATHENA'|'GOOGLEADS'|'GOOGLESHEETS'|'GOOGLEANALYTICS4'|'SERVICENOW'|'MARKETO'|'SAPODATA'|'ZENDESK'|'JIRACLOUD'|'NETSUITEERP'|'HUBSPOT'|'FACEBOOKADS'|'INSTAGRAMADS'|'ZOHOCRM'|'SALESFORCEPARDOT'|'SALESFORCEMARKETINGCLOUD'|'SLACK'|'STRIPE'|'INTERCOM'|'SNAPCHATADS',
              'MatchCriteria': [
                  'string',
              ],
              'ConnectionProperties': {
                  'string': 'string'
              },
              'SparkProperties': {
                  'string': 'string'
              },
              'AthenaProperties': {
                  'string': 'string'
              },
              'PythonProperties': {
                  'string': 'string'
              },
              'PhysicalConnectionRequirements': {
                  'SubnetId': 'string',
                  'SecurityGroupIdList': [
                      'string',
                  ],
                  'AvailabilityZone': 'string'
              },
              'AuthenticationConfiguration': {
                  'AuthenticationType': 'BASIC'|'OAUTH2'|'CUSTOM'|'IAM',
                  'OAuth2Properties': {
                      'OAuth2GrantType': 'AUTHORIZATION_CODE'|'CLIENT_CREDENTIALS'|'JWT_BEARER',
                      'OAuth2ClientApplication': {
                          'UserManagedClientApplicationClientId': 'string',
                          'AWSManagedClientApplicationReference': 'string'
                      },
                      'TokenUrl': 'string',
                      'TokenUrlParametersMap': {
                          'string': 'string'
                      },
                      'AuthorizationCodeProperties': {
                          'AuthorizationCode': 'string',
                          'RedirectUri': 'string'
                      },
                      'OAuth2Credentials': {
                          'UserManagedClientApplicationClientSecret': 'string',
                          'AccessToken': 'string',
                          'RefreshToken': 'string',
                          'JwtToken': 'string'
                      }
                  },
                  'SecretArn': 'string',
                  'KmsKeyArn': 'string',
                  'BasicAuthenticationCredentials': {
                      'Username': 'string',
                      'Password': 'string'
                  },
                  'CustomAuthenticationCredentials': {
                      'string': 'string'
                  }
              },
              'ValidateCredentials': True|False,
              'ValidateForComputeEnvironments': [
                  'SPARK'|'ATHENA'|'PYTHON',
              ]
          },
          Tags={
              'string': 'string'
          }
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog in
        which to create the connection. If none is provided, the
        Amazon Web Services account ID is used by default.

      * **ConnectionInput** (*dict*) --

        **[REQUIRED]**

        A "ConnectionInput" object defining the connection to create.

        * **Name** *(string) --* **[REQUIRED]**

          The name of the connection.

        * **Description** *(string) --*

          The description of the connection.

        * **ConnectionType** *(string) --* **[REQUIRED]**

          The type of the connection. Currently, these types are
          supported:

          * "JDBC" - Designates a connection to a database through
            Java Database Connectivity (JDBC). "JDBC" Connections use
            the following ConnectionParameters.

            * Required: All of ("HOST", "PORT", "JDBC_ENGINE") or
              "JDBC_CONNECTION_URL".

            * Required: All of ("USERNAME", "PASSWORD") or
              "SECRET_ID".

            * Optional: "JDBC_ENFORCE_SSL", "CUSTOM_JDBC_CERT",
              "CUSTOM_JDBC_CERT_STRING",
              "SKIP_CUSTOM_JDBC_CERT_VALIDATION". These parameters are
              used to configure SSL with JDBC.

          * "KAFKA" - Designates a connection to an Apache Kafka
            streaming platform. "KAFKA" Connections use the following
            ConnectionParameters.

            * Required: "KAFKA_BOOTSTRAP_SERVERS".

            * Optional: "KAFKA_SSL_ENABLED", "KAFKA_CUSTOM_CERT",
              "KAFKA_SKIP_CUSTOM_CERT_VALIDATION". These parameters
              are used to configure SSL with "KAFKA".

            * Optional: "KAFKA_CLIENT_KEYSTORE",
              "KAFKA_CLIENT_KEYSTORE_PASSWORD",
              "KAFKA_CLIENT_KEY_PASSWORD",
              "ENCRYPTED_KAFKA_CLIENT_KEYSTORE_PASSWORD",
              "ENCRYPTED_KAFKA_CLIENT_KEY_PASSWORD". These parameters
              are used to configure TLS client configuration with SSL
              in "KAFKA".

            * Optional: "KAFKA_SASL_MECHANISM". Can be specified as
              "SCRAM-SHA-512", "GSSAPI", or "AWS_MSK_IAM".

            * Optional: "KAFKA_SASL_SCRAM_USERNAME",
              "KAFKA_SASL_SCRAM_PASSWORD",
              "ENCRYPTED_KAFKA_SASL_SCRAM_PASSWORD". These parameters
              are used to configure SASL/SCRAM-SHA-512 authentication
              with "KAFKA".

            * Optional: "KAFKA_SASL_GSSAPI_KEYTAB",
              "KAFKA_SASL_GSSAPI_KRB5_CONF",
              "KAFKA_SASL_GSSAPI_SERVICE",
              "KAFKA_SASL_GSSAPI_PRINCIPAL". These parameters are used
              to configure SASL/GSSAPI authentication with "KAFKA".

          * "MONGODB" - Designates a connection to a MongoDB document
            database. "MONGODB" Connections use the following
            ConnectionParameters.

            * Required: "CONNECTION_URL".

            * Required: All of ("USERNAME", "PASSWORD") or
              "SECRET_ID".

          * "VIEW_VALIDATION_REDSHIFT" - Designates a connection used
            for view validation by Amazon Redshift.

          * "VIEW_VALIDATION_ATHENA" - Designates a connection used
            for view validation by Amazon Athena.

          * "NETWORK" - Designates a network connection to a data
            source within an Amazon Virtual Private Cloud environment
            (Amazon VPC). "NETWORK" Connections do not require
            ConnectionParameters. Instead, provide a
            PhysicalConnectionRequirements.

          * "MARKETPLACE" - Uses configuration settings contained in a
            connector purchased from Amazon Web Services Marketplace
            to read from and write to data stores that are not
            natively supported by Glue. "MARKETPLACE" Connections use
            the following ConnectionParameters.

            * Required: "CONNECTOR_TYPE", "CONNECTOR_URL",
              "CONNECTOR_CLASS_NAME", "CONNECTION_URL".

            * Required for "JDBC" "CONNECTOR_TYPE" connections: All of
              ("USERNAME", "PASSWORD") or "SECRET_ID".

          * "CUSTOM" - Uses configuration settings contained in a
            custom connector to read from and write to data stores
            that are not natively supported by Glue.

          Additionally, a "ConnectionType" for the following SaaS
          connectors is supported:

          * "FACEBOOKADS" - Designates a connection to Facebook Ads.

          * "GOOGLEADS" - Designates a connection to Google Ads.

          * "GOOGLESHEETS" - Designates a connection to Google Sheets.

          * "GOOGLEANALYTICS4" - Designates a connection to Google
            Analytics 4.

          * "HUBSPOT" - Designates a connection to HubSpot.

          * "INSTAGRAMADS" - Designates a connection to Instagram Ads.

          * "INTERCOM" - Designates a connection to Intercom.

          * "JIRACLOUD" - Designates a connection to Jira Cloud.

          * "MARKETO" - Designates a connection to Adobe Marketo
            Engage.

          * "NETSUITEERP" - Designates a connection to Oracle
            NetSuite.

          * "SALESFORCE" - Designates a connection to Salesforce using
            OAuth authentication.

          * "SALESFORCEMARKETINGCLOUD" - Designates a connection to
            Salesforce Marketing Cloud.

          * "SALESFORCEPARDOT" - Designates a connection to Salesforce
            Marketing Cloud Account Engagement (MCAE).

          * "SAPODATA" - Designates a connection to SAP OData.

          * "SERVICENOW" - Designates a connection to ServiceNow.

          * "SLACK" - Designates a connection to Slack.

          * "SNAPCHATADS" - Designates a connection to Snapchat Ads.

          * "STRIPE" - Designates a connection to Stripe.

          * "ZENDESK" - Designates a connection to Zendesk.

          * "ZOHOCRM" - Designates a connection to Zoho CRM.

          For more information on the connection parameters needed for
          a particular connector, see the documentation for the
          connector in "Adding a Glue connection"
          (https://docs.aws.amazon.com/glue/latest/dg/console-connections.html)
          in the Glue User Guide.

          "SFTP" is not supported.

          For more information about how optional ConnectionProperties
          are used to configure features in Glue, consult Glue
          connection properties.

          For more information about how optional ConnectionProperties
          are used to configure features in Glue Studio, consult Using
          connectors and connections.

        * **MatchCriteria** *(list) --*

          A list of criteria that can be used in selecting this
          connection.

          * *(string) --*

        * **ConnectionProperties** *(dict) --* **[REQUIRED]**

          These key-value pairs define parameters for the connection.

          * *(string) --*

            * *(string) --*

        * **SparkProperties** *(dict) --*

          Connection properties specific to the Spark compute
          environment.

          * *(string) --*

            * *(string) --*

        * **AthenaProperties** *(dict) --*

          Connection properties specific to the Athena compute
          environment.

          * *(string) --*

            * *(string) --*

        * **PythonProperties** *(dict) --*

          Connection properties specific to the Python compute
          environment.

          * *(string) --*

            * *(string) --*

        * **PhysicalConnectionRequirements** *(dict) --*

          The physical connection requirements, such as virtual
          private cloud (VPC) and "SecurityGroup", that are needed to
          successfully make this connection.

          * **SubnetId** *(string) --*

            The subnet ID used by the connection.

          * **SecurityGroupIdList** *(list) --*

            The security group ID list used by the connection.

            * *(string) --*

          * **AvailabilityZone** *(string) --*

            The connection's Availability Zone.

        * **AuthenticationConfiguration** *(dict) --*

          The authentication properties of the connection.

          * **AuthenticationType** *(string) --*

            The type of authentication used by the connection in the
            CreateConnection request: "BASIC", "OAUTH2", "CUSTOM", or
            "IAM".

          * **OAuth2Properties** *(dict) --*

            The properties for OAuth2 authentication in the
            CreateConnection request.

            * **OAuth2GrantType** *(string) --*

              The OAuth2 grant type in the CreateConnection request.
              For example, "AUTHORIZATION_CODE", "JWT_BEARER", or
              "CLIENT_CREDENTIALS".

            * **OAuth2ClientApplication** *(dict) --*

              The client application type in the CreateConnection
              request. For example, "AWS_MANAGED" or "USER_MANAGED".

              * **UserManagedClientApplicationClientId** *(string) --*

                The client application clientID if the ClientAppType
                is "USER_MANAGED".

              * **AWSManagedClientApplicationReference** *(string) --*

                The reference to the SaaS-side client app that is
                Amazon Web Services managed.

            * **TokenUrl** *(string) --*

              The URL of the provider's authentication server, to
              exchange an authorization code for an access token.

            * **TokenUrlParametersMap** *(dict) --*

              A map of parameters that are added to the token "GET"
              request.

              * *(string) --*

                * *(string) --*

            * **AuthorizationCodeProperties** *(dict) --*

              The set of properties required for the OAuth2
              "AUTHORIZATION_CODE" grant type.

              * **AuthorizationCode** *(string) --*

                An authorization code to be used in the third leg of
                the "AUTHORIZATION_CODE" grant workflow. This is a
                single-use code which becomes invalid once exchanged
                for an access token, thus it is acceptable to have
                this value as a request parameter.

              * **RedirectUri** *(string) --*

                The redirect URI to which the authorization server
                redirects the user when issuing an authorization
                code. The URI is subsequently used when the
                authorization code is exchanged for an access token.

            * **OAuth2Credentials** *(dict) --*

              The credentials used when the authentication type is
              OAuth2 authentication.

              * **UserManagedClientApplicationClientSecret** *(string)
                --*

                The client application client secret if the client
                application is user managed.

              * **AccessToken** *(string) --*

                The access token used when the authentication type is
                OAuth2.

              * **RefreshToken** *(string) --*

                The refresh token used when the authentication type is
                OAuth2.

              * **JwtToken** *(string) --*

                The JSON Web Token (JWT) used when the authentication
                type is OAuth2.

          * **SecretArn** *(string) --*

            The Secrets Manager ARN used to store credentials in the
            CreateConnection request.

          * **KmsKeyArn** *(string) --*

            The ARN of the KMS key used to encrypt the connection. It
            is only taken as an input in the request and stored in
            Secrets Manager.

          * **BasicAuthenticationCredentials** *(dict) --*

            The credentials used when the authentication type is basic
            authentication.

            * **Username** *(string) --*

              The username to connect to the data source.

            * **Password** *(string) --*

              The password to connect to the data source.

          * **CustomAuthenticationCredentials** *(dict) --*

            The credentials used when the authentication type is
            custom authentication.

            * *(string) --*

              * *(string) --*

        * **ValidateCredentials** *(boolean) --*

          A flag that controls whether the credentials are validated
          when the connection is created. Default is true.

        * **ValidateForComputeEnvironments** *(list) --*

          The compute environments that the specified connection
          properties are validated against.

          * *(string) --*

      * **Tags** (*dict*) --

        The tags you assign to the connection.

        * *(string) --*

          * *(string) --*
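
   The "Required" rules listed for the "JDBC" connection type can be
   checked on the client side before the request is sent. The sketch
   below is only an illustration: "missing_jdbc_properties" is a
   hypothetical helper (not part of boto3), and the name, URL, and
   secret ID are placeholder values.

```python
def missing_jdbc_properties(props):
    """Return a list of problems found in JDBC ConnectionProperties.

    Hypothetical helper, not part of boto3. It encodes only the two
    "Required" rules listed for the "JDBC" connection type.
    """
    problems = []

    # Rule 1: either JDBC_CONNECTION_URL, or all of HOST, PORT, JDBC_ENGINE.
    has_url = "JDBC_CONNECTION_URL" in props
    has_host_triplet = all(k in props for k in ("HOST", "PORT", "JDBC_ENGINE"))
    if not (has_url or has_host_triplet):
        problems.append("need JDBC_CONNECTION_URL or all of HOST, PORT, JDBC_ENGINE")

    # Rule 2: either SECRET_ID, or both USERNAME and PASSWORD.
    has_secret = "SECRET_ID" in props
    has_user_pass = all(k in props for k in ("USERNAME", "PASSWORD"))
    if not (has_secret or has_user_pass):
        problems.append("need SECRET_ID or both USERNAME and PASSWORD")

    return problems


# Placeholder values for illustration only.
connection_input = {
    "Name": "example-jdbc-connection",
    "ConnectionType": "JDBC",
    "ConnectionProperties": {
        "JDBC_CONNECTION_URL": "jdbc:postgresql://db.example.com:5432/sales",
        "SECRET_ID": "example/secret/id",
    },
}

problems = missing_jdbc_properties(connection_input["ConnectionProperties"])
print(problems)
```

   An empty list means both rules are satisfied. The service still
   performs its own validation, so this check is a convenience rather
   than a guarantee that the request will be accepted.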

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'CreateConnectionStatus': 'READY'|'IN_PROGRESS'|'FAILED'
         }

      **Response Structure**

      * *(dict) --*

        * **CreateConnectionStatus** *(string) --*

          The status of the connection creation request. The request
          can take some time for certain authentication types, for
          example when creating an OAuth connection with token
          exchange over VPC.
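
   Because creation can still be in progress when the call returns, a
   caller may want to poll until the status settles. The sketch below
   is one possible approach, not a documented pattern. It assumes that
   "get_connection" reports the same "READY", "IN_PROGRESS", and
   "FAILED" values under "Connection" / "Status"; "wait_until_ready",
   the delay, and the attempt limit are hypothetical choices.

```python
import time


def wait_until_ready(client, name, initial_status,
                     delay_seconds=5.0, max_attempts=12):
    """Poll while the connection is IN_PROGRESS and return the last status.

    Assumption: get_connection returns the connection status under
    response["Connection"]["Status"]. If the key is absent, the previous
    status is kept.
    """
    status = initial_status
    attempts = 0
    while status == "IN_PROGRESS" and attempts < max_attempts:
        time.sleep(delay_seconds)
        response = client.get_connection(Name=name)
        status = response["Connection"].get("Status", status)
        attempts += 1
    return status


# Usage with a boto3 Glue client (not executed here):
#   response = client.create_connection(ConnectionInput=connection_input)
#   final = wait_until_ready(client, connection_input["Name"],
#                            response["CreateConnectionStatus"])
```

   The function returns "IN_PROGRESS" if the attempt limit is reached
   first, so the caller can decide whether to keep waiting.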

   **Exceptions**

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.GlueEncryptionException"
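
   The exceptions listed above are exposed on the client object, so
   they can be caught without extra imports. A minimal sketch, assuming
   "client" is a boto3 Glue client; "create_connection_if_absent" is a
   hypothetical wrapper, and treating an existing connection as a
   non-error is a design choice of this example only.

```python
def create_connection_if_absent(client, connection_input):
    """Create the connection; return None if the name is already taken.

    `client` is expected to be a boto3 Glue client, for example
    boto3.client("glue"). Other exceptions are left to propagate.
    """
    try:
        response = client.create_connection(ConnectionInput=connection_input)
    except client.exceptions.AlreadyExistsException:
        # A connection with this name already exists in the Data Catalog.
        return None
    return response.get("CreateConnectionStatus")
```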


batch_update_partition
**********************

Glue.Client.batch_update_partition(**kwargs)

   Updates one or more partitions in a batch operation.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.batch_update_partition(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          Entries=[
              {
                  'PartitionValueList': [
                      'string',
                  ],
                  'PartitionInput': {
                      'Values': [
                          'string',
                      ],
                      'LastAccessTime': datetime(2015, 1, 1),
                      'StorageDescriptor': {
                          'Columns': [
                              {
                                  'Name': 'string',
                                  'Type': 'string',
                                  'Comment': 'string',
                                  'Parameters': {
                                      'string': 'string'
                                  }
                              },
                          ],
                          'Location': 'string',
                          'AdditionalLocations': [
                              'string',
                          ],
                          'InputFormat': 'string',
                          'OutputFormat': 'string',
                          'Compressed': True|False,
                          'NumberOfBuckets': 123,
                          'SerdeInfo': {
                              'Name': 'string',
                              'SerializationLibrary': 'string',
                              'Parameters': {
                                  'string': 'string'
                              }
                          },
                          'BucketColumns': [
                              'string',
                          ],
                          'SortColumns': [
                              {
                                  'Column': 'string',
                                  'SortOrder': 123
                              },
                          ],
                          'Parameters': {
                              'string': 'string'
                          },
                          'SkewedInfo': {
                              'SkewedColumnNames': [
                                  'string',
                              ],
                              'SkewedColumnValues': [
                                  'string',
                              ],
                              'SkewedColumnValueLocationMaps': {
                                  'string': 'string'
                              }
                          },
                          'StoredAsSubDirectories': True|False,
                          'SchemaReference': {
                              'SchemaId': {
                                  'SchemaArn': 'string',
                                  'SchemaName': 'string',
                                  'RegistryName': 'string'
                              },
                              'SchemaVersionId': 'string',
                              'SchemaVersionNumber': 123
                          }
                      },
                      'Parameters': {
                          'string': 'string'
                      },
                      'LastAnalyzedTime': datetime(2015, 1, 1)
                  }
              },
          ]
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the catalog in which the
        partition is to be updated. Currently, this should be the
        Amazon Web Services account ID.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the metadata database in which the partition is to
        be updated.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the metadata table in which the partition is to be
        updated.

      * **Entries** (*list*) --

        **[REQUIRED]**

        A list of up to 100 "BatchUpdatePartitionRequestEntry" objects
        to update.

        * *(dict) --*

          A structure that contains the values and structure used to
          update a partition.

          * **PartitionValueList** *(list) --* **[REQUIRED]**

            A list of values defining the partitions.

            * *(string) --*

          * **PartitionInput** *(dict) --* **[REQUIRED]**

            The structure used to update a partition.

            * **Values** *(list) --*

              The values of the partition. Although this parameter is
              not required by the SDK, you must specify this parameter
              for a valid input.

              The values for the keys for the new partition must be
              passed as an array of String objects that must be
              ordered in the same order as the partition keys
              appearing in the Amazon S3 prefix. Otherwise Glue will
              add the values to the wrong keys.

              * *(string) --*

            * **LastAccessTime** *(datetime) --*

              The last time at which the partition was accessed.

            * **StorageDescriptor** *(dict) --*

              Provides information about the physical location where
              the partition is stored.

              * **Columns** *(list) --*

                A list of the "Columns" in the table.

                * *(dict) --*

                  A column in a "Table".

                  * **Name** *(string) --* **[REQUIRED]**

                    The name of the "Column".

                  * **Type** *(string) --*

                    The data type of the "Column".

                  * **Comment** *(string) --*

                    A free-form text comment.

                  * **Parameters** *(dict) --*

                    These key-value pairs define properties associated
                    with the column.

                    * *(string) --*

                      * *(string) --*

              * **Location** *(string) --*

                The physical location of the table. By default, this
                takes the form of the warehouse location, followed by
                the database location in the warehouse, followed by
                the table name.

              * **AdditionalLocations** *(list) --*

                A list of locations that point to the path where a
                Delta table is located.

                * *(string) --*

              * **InputFormat** *(string) --*

                The input format: "SequenceFileInputFormat" (binary),
                or "TextInputFormat", or a custom format.

              * **OutputFormat** *(string) --*

                The output format: "SequenceFileOutputFormat"
                (binary), or "IgnoreKeyTextOutputFormat", or a custom
                format.

              * **Compressed** *(boolean) --*

                "True" if the data in the table is compressed, or
                "False" if not.

              * **NumberOfBuckets** *(integer) --*

                Must be specified if the table contains any dimension
                columns.

              * **SerdeInfo** *(dict) --*

                The serialization/deserialization (SerDe) information.

                * **Name** *(string) --*

                  Name of the SerDe.

                * **SerializationLibrary** *(string) --*

                  Usually the class that implements the SerDe. An
                  example is
                  "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe".

                * **Parameters** *(dict) --*

                  These key-value pairs define initialization
                  parameters for the SerDe.

                  * *(string) --*

                    * *(string) --*

              * **BucketColumns** *(list) --*

                A list of reducer grouping columns, clustering
                columns, and bucketing columns in the table.

                * *(string) --*

              * **SortColumns** *(list) --*

                A list specifying the sort order of each bucket in the
                table.

                * *(dict) --*

                  Specifies the sort order of a sorted column.

                  * **Column** *(string) --* **[REQUIRED]**

                    The name of the column.

                  * **SortOrder** *(integer) --* **[REQUIRED]**

                    Indicates that the column is sorted in ascending
                    order ("== 1"), or in descending order ("== 0").

              * **Parameters** *(dict) --*

                The user-supplied properties in key-value form.

                * *(string) --*

                  * *(string) --*

              * **SkewedInfo** *(dict) --*

                The information about values that appear frequently in
                a column (skewed values).

                * **SkewedColumnNames** *(list) --*

                  A list of names of columns that contain skewed
                  values.

                  * *(string) --*

                * **SkewedColumnValues** *(list) --*

                  A list of values that appear so frequently as to be
                  considered skewed.

                  * *(string) --*

                * **SkewedColumnValueLocationMaps** *(dict) --*

                  A mapping of skewed values to the columns that
                  contain them.

                  * *(string) --*

                    * *(string) --*

              * **StoredAsSubDirectories** *(boolean) --*

                "True" if the table data is stored in subdirectories,
                or "False" if not.

              * **SchemaReference** *(dict) --*

                An object that references a schema stored in the Glue
                Schema Registry.

                When creating a table, you can pass an empty list of
                columns for the schema, and instead use a schema
                reference.

                * **SchemaId** *(dict) --*

                  A structure that contains schema identity fields.
                  Either this or the "SchemaVersionId" has to be
                  provided.

                  * **SchemaArn** *(string) --*

                    The Amazon Resource Name (ARN) of the schema. One
                    of "SchemaArn" or "SchemaName" has to be provided.

                  * **SchemaName** *(string) --*

                    The name of the schema. One of "SchemaArn" or
                    "SchemaName" has to be provided.

                  * **RegistryName** *(string) --*

                    The name of the schema registry that contains the
                    schema.

                * **SchemaVersionId** *(string) --*

                  The unique ID assigned to a version of the schema.
                  Either this or the "SchemaId" has to be provided.

                * **SchemaVersionNumber** *(integer) --*

                  The version number of the schema.

            * **Parameters** *(dict) --*

              These key-value pairs define partition parameters.

              * *(string) --*

                * *(string) --*

            * **LastAnalyzedTime** *(datetime) --*

              The last time at which column statistics were computed
              for this partition.
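
   Since "Entries" accepts at most 100 objects per call, a larger set
   of updates has to be split across several requests. The sketch
   below shows one way to do that. "chunk_entries" and
   "batch_update_in_chunks" are hypothetical helpers, and "client" is
   assumed to be a boto3 Glue client.

```python
def chunk_entries(entries, batch_size=100):
    """Yield successive slices holding at most `batch_size` entries."""
    for start in range(0, len(entries), batch_size):
        yield entries[start:start + batch_size]


def batch_update_in_chunks(client, database_name, table_name, entries,
                           batch_size=100):
    """Call batch_update_partition once per chunk and collect all errors."""
    errors = []
    for batch in chunk_entries(entries, batch_size):
        response = client.batch_update_partition(
            DatabaseName=database_name,
            TableName=table_name,
            Entries=batch,
        )
        # Each response may carry per-partition failures for that chunk.
        errors.extend(response.get("Errors", []))
    return errors


# Chunk sizes for 250 placeholder entries.
print([len(batch) for batch in chunk_entries(list(range(250)))])
# prints [100, 100, 50]
```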

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Errors': [
                 {
                     'PartitionValueList': [
                         'string',
                     ],
                     'ErrorDetail': {
                         'ErrorCode': 'string',
                         'ErrorMessage': 'string'
                     }
                 },
             ]
         }

      **Response Structure**

      * *(dict) --*

        * **Errors** *(list) --*

          The errors encountered when trying to update the requested
          partitions. A list of "BatchUpdatePartitionFailureEntry"
          objects.

          * *(dict) --*

            Contains information about a batch update partition error.

            * **PartitionValueList** *(list) --*

              A list of values defining the partitions.

              * *(string) --*

            * **ErrorDetail** *(dict) --*

              The details about the batch update partition error.

              * **ErrorCode** *(string) --*

                The code associated with this error.

              * **ErrorMessage** *(string) --*

                A message describing the error.
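
   The "Errors" list can be turned into a lookup keyed by partition
   values, which makes it easier to retry or report individual
   failures. A minimal sketch; "errors_by_partition" is a hypothetical
   helper, and the sample response uses made-up values that follow the
   response syntax shown above.

```python
def errors_by_partition(response):
    """Map each failed partition's values (as a tuple) to (code, message)."""
    result = {}
    for entry in response.get("Errors", []):
        detail = entry.get("ErrorDetail", {})
        key = tuple(entry.get("PartitionValueList", []))
        result[key] = (detail.get("ErrorCode"), detail.get("ErrorMessage"))
    return result


# Made-up response shaped like the Response Syntax above.
sample_response = {
    "Errors": [
        {
            "PartitionValueList": ["2024", "01"],
            "ErrorDetail": {
                "ErrorCode": "EntityNotFoundException",
                "ErrorMessage": "Partition not found.",
            },
        },
    ]
}

failures = errors_by_partition(sample_response)
for values, (code, message) in failures.items():
    print(values, code, message)
```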

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.GlueEncryptionException"


update_classifier
*****************

Glue.Client.update_classifier(**kwargs)

   Modifies an existing classifier (a "GrokClassifier", an
   "XMLClassifier", a "JsonClassifier", or a "CsvClassifier",
   depending on which field is present).

   See also: AWS API Documentation

   **Request Syntax**

      response = client.update_classifier(
          GrokClassifier={
              'Name': 'string',
              'Classification': 'string',
              'GrokPattern': 'string',
              'CustomPatterns': 'string'
          },
          XMLClassifier={
              'Name': 'string',
              'Classification': 'string',
              'RowTag': 'string'
          },
          JsonClassifier={
              'Name': 'string',
              'JsonPath': 'string'
          },
          CsvClassifier={
              'Name': 'string',
              'Delimiter': 'string',
              'QuoteSymbol': 'string',
              'ContainsHeader': 'UNKNOWN'|'PRESENT'|'ABSENT',
              'Header': [
                  'string',
              ],
              'DisableValueTrimming': True|False,
              'AllowSingleColumn': True|False,
              'CustomDatatypeConfigured': True|False,
              'CustomDatatypes': [
                  'string',
              ],
              'Serde': 'OpenCSVSerDe'|'LazySimpleSerDe'|'None'
          }
      )

   Parameters:
      * **GrokClassifier** (*dict*) --

        A "GrokClassifier" object with updated fields.

        * **Name** *(string) --* **[REQUIRED]**

          The name of the "GrokClassifier".

        * **Classification** *(string) --*

          An identifier of the data format that the classifier
          matches, such as Twitter, JSON, Omniture logs, Amazon
          CloudWatch Logs, and so on.

        * **GrokPattern** *(string) --*

          The grok pattern used by this classifier.

        * **CustomPatterns** *(string) --*

          Optional custom grok patterns used by this classifier.

      * **XMLClassifier** (*dict*) --

        An "XMLClassifier" object with updated fields.

        * **Name** *(string) --* **[REQUIRED]**

          The name of the classifier.

        * **Classification** *(string) --*

          An identifier of the data format that the classifier
          matches.

        * **RowTag** *(string) --*

          The XML tag designating the element that contains each
          record in an XML document being parsed. This cannot identify
          a self-closing element (closed by "/>"). An empty row
          element that contains only attributes can be parsed as long
          as it ends with a closing tag (for example, "<row item_a="A"
          item_b="B"></row>" is okay, but "<row item_a="A" item_b="B"
          />" is not).

      * **JsonClassifier** (*dict*) --

        A "JsonClassifier" object with updated fields.

        * **Name** *(string) --* **[REQUIRED]**

          The name of the classifier.

        * **JsonPath** *(string) --*

          A "JsonPath" string defining the JSON data for the
          classifier to classify. Glue supports a subset of JsonPath,
          as described in Writing JsonPath Custom Classifiers.

      * **CsvClassifier** (*dict*) --

        A "CsvClassifier" object with updated fields.

        * **Name** *(string) --* **[REQUIRED]**

          The name of the classifier.

        * **Delimiter** *(string) --*

          A custom symbol to denote what separates each column entry
          in the row.

        * **QuoteSymbol** *(string) --*

          A custom symbol to denote what combines content into a
          single column value. It must be different from the column
          delimiter.

        * **ContainsHeader** *(string) --*

          Indicates whether the CSV file contains a header.

        * **Header** *(list) --*

          A list of strings representing column names.

          * *(string) --*

        * **DisableValueTrimming** *(boolean) --*

          Specifies not to trim values before identifying the type of
          column values. The default value is true.

        * **AllowSingleColumn** *(boolean) --*

          Enables the processing of files that contain only one
          column.

        * **CustomDatatypeConfigured** *(boolean) --*

          Specifies the configuration of custom datatypes.

        * **CustomDatatypes** *(list) --*

          Specifies a list of supported custom datatypes.

          * *(string) --*

        * **Serde** *(string) --*

          Sets the SerDe for processing CSV in the classifier, which
          will be applied in the Data Catalog. Valid values are
          "OpenCSVSerDe", "LazySimpleSerDe", and "None". You can
          specify the "None" value when you want the crawler to do the
          detection.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.VersionMismatchException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.OperationTimeoutException"
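
   The following is a minimal sketch, not taken from the service
   documentation, of how the "CsvClassifier" argument might be
   assembled before calling "update_classifier". The helper name
   "build_csv_classifier_update" is hypothetical. Its checks mirror
   only the constraints stated above: the allowed "ContainsHeader"
   values and the rule that "QuoteSymbol" must differ from the
   delimiter.

```python
# Allowed values for ContainsHeader, as listed in the request syntax.
VALID_CONTAINS_HEADER = {"UNKNOWN", "PRESENT", "ABSENT"}


def build_csv_classifier_update(name, delimiter=None, quote_symbol=None,
                                contains_header=None):
    """Return keyword arguments for update_classifier (hypothetical helper).

    Only fields that are explicitly supplied are included, so the
    request carries the required Name plus the fields being set.
    """
    if contains_header is not None and contains_header not in VALID_CONTAINS_HEADER:
        raise ValueError(f"ContainsHeader must be one of {sorted(VALID_CONTAINS_HEADER)}")
    if delimiter is not None and delimiter == quote_symbol:
        # The parameter description requires the two symbols to differ.
        raise ValueError("QuoteSymbol must be different from Delimiter")

    classifier = {"Name": name}
    optional = {
        "Delimiter": delimiter,
        "QuoteSymbol": quote_symbol,
        "ContainsHeader": contains_header,
    }
    classifier.update({key: value for key, value in optional.items() if value is not None})
    return {"CsvClassifier": classifier}
```

   With a client created by "boto3.client('glue')", the result could
   be passed as
   "client.update_classifier(**build_csv_classifier_update('my-csv',
   delimiter='|'))", where "my-csv" is a placeholder classifier name.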


start_blueprint_run
*******************

Glue.Client.start_blueprint_run(**kwargs)

   Starts a new run of the specified blueprint.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.start_blueprint_run(
          BlueprintName='string',
          Parameters='string',
          RoleArn='string'
      )

   Parameters:
      * **BlueprintName** (*string*) --

        **[REQUIRED]**

        The name of the blueprint.

      * **Parameters** (*string*) -- Specifies the parameters as a
        "BlueprintParameters" object.

      * **RoleArn** (*string*) --

        **[REQUIRED]**

        Specifies the IAM role used to create the workflow.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'RunId': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **RunId** *(string) --*

          The run ID for this blueprint run.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.IllegalBlueprintStateException"
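
   As a rough illustration, the sketch below starts a run and returns
   its "RunId". It assumes the blueprint accepts its parameters
   serialized as a JSON string, which the text above does not state.
   The helper name "run_blueprint" and its arguments are hypothetical.

```python
import json


def run_blueprint(client, blueprint_name, role_arn, parameters=None):
    """Start a blueprint run and return its RunId (hypothetical helper).

    `client` is expected to behave like boto3.client('glue').
    `parameters` is a plain dict; JSON serialization is an assumption.
    """
    kwargs = {"BlueprintName": blueprint_name, "RoleArn": role_arn}
    if parameters is not None:
        # Parameters is a string field, so the dict is serialized first.
        kwargs["Parameters"] = json.dumps(parameters)
    response = client.start_blueprint_run(**kwargs)
    return response["RunId"]
```

   Passing the client as an argument keeps the helper easy to exercise
   against a stub object without AWS credentials.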


delete_workflow
***************

Glue.Client.delete_workflow(**kwargs)

   Deletes a workflow.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_workflow(
          Name='string'
      )

   Parameters:
      **Name** (*string*) --

      **[REQUIRED]**

      Name of the workflow to be deleted.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Name': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Name** *(string) --*

          Name of the workflow specified in input.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ConcurrentModificationException"
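
   One possible way to handle "ConcurrentModificationException" is to
   retry the call a few times. The sketch below is only an
   illustration of that policy; the helper name, the attempt count,
   and the fixed delay are all assumptions, and whether retrying is
   appropriate depends on the application.

```python
import time


def delete_workflow_with_retry(client, name, attempts=3, delay_seconds=1.0):
    """Delete a workflow, retrying on concurrent modification.

    `client` is expected to behave like boto3.client('glue').
    Returns the workflow name echoed back in the response.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return client.delete_workflow(Name=name)["Name"]
        except client.exceptions.ConcurrentModificationException:
            if attempt == attempts:
                # Out of attempts: surface the original error.
                raise
            time.sleep(delay_seconds)
```

   All other exceptions listed above propagate to the caller
   unchanged.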


batch_get_crawlers
******************

Glue.Client.batch_get_crawlers(**kwargs)

   Returns a list of resource metadata for a given list of crawler
   names. After calling the "ListCrawlers" operation, you can call
   this operation to access the data to which you have been granted
   permissions. This operation supports all IAM permissions, including
   permission conditions that use tags.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.batch_get_crawlers(
          CrawlerNames=[
              'string',
          ]
      )

   Parameters:
      **CrawlerNames** (*list*) --

      **[REQUIRED]**

      A list of crawler names, which might be the names returned from
      the "ListCrawlers" operation.

      * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Crawlers': [
                 {
                     'Name': 'string',
                     'Role': 'string',
                     'Targets': {
                         'S3Targets': [
                             {
                                 'Path': 'string',
                                 'Exclusions': [
                                     'string',
                                 ],
                                 'ConnectionName': 'string',
                                 'SampleSize': 123,
                                 'EventQueueArn': 'string',
                                 'DlqEventQueueArn': 'string'
                             },
                         ],
                         'JdbcTargets': [
                             {
                                 'ConnectionName': 'string',
                                 'Path': 'string',
                                 'Exclusions': [
                                     'string',
                                 ],
                                 'EnableAdditionalMetadata': [
                                     'COMMENTS'|'RAWTYPES',
                                 ]
                             },
                         ],
                         'MongoDBTargets': [
                             {
                                 'ConnectionName': 'string',
                                 'Path': 'string',
                                 'ScanAll': True|False
                             },
                         ],
                         'DynamoDBTargets': [
                             {
                                 'Path': 'string',
                                 'scanAll': True|False,
                                 'scanRate': 123.0
                             },
                         ],
                         'CatalogTargets': [
                             {
                                 'DatabaseName': 'string',
                                 'Tables': [
                                     'string',
                                 ],
                                 'ConnectionName': 'string',
                                 'EventQueueArn': 'string',
                                 'DlqEventQueueArn': 'string'
                             },
                         ],
                         'DeltaTargets': [
                             {
                                 'DeltaTables': [
                                     'string',
                                 ],
                                 'ConnectionName': 'string',
                                 'WriteManifest': True|False,
                                 'CreateNativeDeltaTable': True|False
                             },
                         ],
                         'IcebergTargets': [
                             {
                                 'Paths': [
                                     'string',
                                 ],
                                 'ConnectionName': 'string',
                                 'Exclusions': [
                                     'string',
                                 ],
                                 'MaximumTraversalDepth': 123
                             },
                         ],
                         'HudiTargets': [
                             {
                                 'Paths': [
                                     'string',
                                 ],
                                 'ConnectionName': 'string',
                                 'Exclusions': [
                                     'string',
                                 ],
                                 'MaximumTraversalDepth': 123
                             },
                         ]
                     },
                     'DatabaseName': 'string',
                     'Description': 'string',
                     'Classifiers': [
                         'string',
                     ],
                     'RecrawlPolicy': {
                         'RecrawlBehavior': 'CRAWL_EVERYTHING'|'CRAWL_NEW_FOLDERS_ONLY'|'CRAWL_EVENT_MODE'
                     },
                     'SchemaChangePolicy': {
                         'UpdateBehavior': 'LOG'|'UPDATE_IN_DATABASE',
                         'DeleteBehavior': 'LOG'|'DELETE_FROM_DATABASE'|'DEPRECATE_IN_DATABASE'
                     },
                     'LineageConfiguration': {
                         'CrawlerLineageSettings': 'ENABLE'|'DISABLE'
                     },
                     'State': 'READY'|'RUNNING'|'STOPPING',
                     'TablePrefix': 'string',
                     'Schedule': {
                         'ScheduleExpression': 'string',
                         'State': 'SCHEDULED'|'NOT_SCHEDULED'|'TRANSITIONING'
                     },
                     'CrawlElapsedTime': 123,
                     'CreationTime': datetime(2015, 1, 1),
                     'LastUpdated': datetime(2015, 1, 1),
                     'LastCrawl': {
                         'Status': 'SUCCEEDED'|'CANCELLED'|'FAILED',
                         'ErrorMessage': 'string',
                         'LogGroup': 'string',
                         'LogStream': 'string',
                         'MessagePrefix': 'string',
                         'StartTime': datetime(2015, 1, 1)
                     },
                     'Version': 123,
                     'Configuration': 'string',
                     'CrawlerSecurityConfiguration': 'string',
                     'LakeFormationConfiguration': {
                         'UseLakeFormationCredentials': True|False,
                         'AccountId': 'string'
                     }
                 },
             ],
             'CrawlersNotFound': [
                 'string',
             ]
         }

      **Response Structure**

      * *(dict) --*

        * **Crawlers** *(list) --*

          A list of crawler definitions.

          * *(dict) --*

            Specifies a crawler program that examines a data source
            and uses classifiers to try to determine its schema. If
            successful, the crawler records metadata concerning the
            data source in the Glue Data Catalog.

            * **Name** *(string) --*

              The name of the crawler.

            * **Role** *(string) --*

              The Amazon Resource Name (ARN) of an IAM role that's
              used to access customer resources, such as Amazon Simple
              Storage Service (Amazon S3) data.

            * **Targets** *(dict) --*

              A collection of targets to crawl.

              * **S3Targets** *(list) --*

                Specifies Amazon Simple Storage Service (Amazon S3)
                targets.

                * *(dict) --*

                  Specifies a data store in Amazon Simple Storage
                  Service (Amazon S3).

                  * **Path** *(string) --*

                    The path to the Amazon S3 target.

                  * **Exclusions** *(list) --*

                    A list of glob patterns that identify items to
                    exclude from the crawl. For more information, see
                    Catalog Tables with a Crawler.

                    * *(string) --*

                  * **ConnectionName** *(string) --*

                    The name of a connection which allows a job or
                    crawler to access data in Amazon S3 within an
                    Amazon Virtual Private Cloud environment (Amazon
                    VPC).

                  * **SampleSize** *(integer) --*

                    Sets the number of files in each leaf folder to be
                    crawled when crawling sample files in a dataset.
                    If not set, all the files are crawled. A valid
                    value is an integer between 1 and 249.

                  * **EventQueueArn** *(string) --*

                    A valid Amazon SQS ARN. For example,
                    "arn:aws:sqs:region:account:sqs".

                  * **DlqEventQueueArn** *(string) --*

                    A valid Amazon dead-letter SQS ARN. For example,
                    "arn:aws:sqs:region:account:deadLetterQueue".

              * **JdbcTargets** *(list) --*

                Specifies JDBC targets.

                * *(dict) --*

                  Specifies a JDBC data store to crawl.

                  * **ConnectionName** *(string) --*

                    The name of the connection to use to connect to
                    the JDBC target.

                  * **Path** *(string) --*

                    The path of the JDBC target.

                  * **Exclusions** *(list) --*

                    A list of glob patterns that identify items to
                    exclude from the crawl. For more information, see
                    Catalog Tables with a Crawler.

                    * *(string) --*

                  * **EnableAdditionalMetadata** *(list) --*

                    Specify a value of "RAWTYPES" or "COMMENTS" to
                    enable additional metadata in table responses.
                    "RAWTYPES" provides the native-level datatype.
                    "COMMENTS" provides comments associated with a
                    column or table in the database.

                    If you do not need additional metadata, keep the
                    field empty.

                    * *(string) --*

              * **MongoDBTargets** *(list) --*

                Specifies Amazon DocumentDB or MongoDB targets.

                * *(dict) --*

                  Specifies an Amazon DocumentDB or MongoDB data store
                  to crawl.

                  * **ConnectionName** *(string) --*

                    The name of the connection to use to connect to
                    the Amazon DocumentDB or MongoDB target.

                  * **Path** *(string) --*

                    The path of the Amazon DocumentDB or MongoDB
                    target (database/collection).

                  * **ScanAll** *(boolean) --*

                    Indicates whether to scan all the records, or to
                    sample rows from the table. Scanning all the
                    records can take a long time when the table is not
                    a high throughput table.

                    A value of "true" means to scan all records, while
                    a value of "false" means to sample the records. If
                    no value is specified, the value defaults to
                    "true".

              * **DynamoDBTargets** *(list) --*

                Specifies Amazon DynamoDB targets.

                * *(dict) --*

                  Specifies an Amazon DynamoDB table to crawl.

                  * **Path** *(string) --*

                    The name of the DynamoDB table to crawl.

                  * **scanAll** *(boolean) --*

                    Indicates whether to scan all the records, or to
                    sample rows from the table. Scanning all the
                    records can take a long time when the table is not
                    a high throughput table.

                    A value of "true" means to scan all records, while
                    a value of "false" means to sample the records. If
                    no value is specified, the value defaults to
                    "true".

                  * **scanRate** *(float) --*

                    The percentage of the configured read capacity
                    units for the Glue crawler to use. Read capacity
                    units is a term defined by DynamoDB, and is a
                    numeric value that acts as a rate limiter for the
                    number of reads that can be performed on that
                    table per second.

                    The valid values are null or a value between 0.1
                    and 1.5. A null value is used when the user does
                    not provide a value, and defaults to 0.5 of the
                    configured Read Capacity Unit (for provisioned
                    tables), or 0.25 of the max configured Read
                    Capacity Unit (for tables using on-demand mode).

              * **CatalogTargets** *(list) --*

                Specifies Glue Data Catalog targets.

                * *(dict) --*

                  Specifies a Glue Data Catalog target.

                  * **DatabaseName** *(string) --*

                    The name of the database to be synchronized.

                  * **Tables** *(list) --*

                    A list of the tables to be synchronized.

                    * *(string) --*

                  * **ConnectionName** *(string) --*

                    The name of the connection for an Amazon S3-backed
                    Data Catalog table to be a target of the crawl
                    when using a "Catalog" connection type paired with
                    a "NETWORK" Connection type.

                  * **EventQueueArn** *(string) --*

                    A valid Amazon SQS ARN. For example,
                    "arn:aws:sqs:region:account:sqs".

                  * **DlqEventQueueArn** *(string) --*

                    A valid Amazon dead-letter SQS ARN. For example,
                    "arn:aws:sqs:region:account:deadLetterQueue".

              * **DeltaTargets** *(list) --*

                Specifies Delta data store targets.

                * *(dict) --*

                  Specifies a Delta data store to crawl one or more
                  Delta tables.

                  * **DeltaTables** *(list) --*

                    A list of the Amazon S3 paths to the Delta tables.

                    * *(string) --*

                  * **ConnectionName** *(string) --*

                    The name of the connection to use to connect to
                    the Delta table target.

                  * **WriteManifest** *(boolean) --*

                    Specifies whether to write the manifest files to
                    the Delta table path.

                  * **CreateNativeDeltaTable** *(boolean) --*

                    Specifies whether the crawler will create native
                    tables, to allow integration with query engines
                    that support querying of the Delta transaction log
                    directly.

              * **IcebergTargets** *(list) --*

                Specifies Apache Iceberg data store targets.

                * *(dict) --*

                  Specifies an Apache Iceberg data source where
                  Iceberg tables are stored in Amazon S3.

                  * **Paths** *(list) --*

                    One or more Amazon S3 paths that contain Iceberg
                    metadata folders as "s3://bucket/prefix".

                    * *(string) --*

                  * **ConnectionName** *(string) --*

                    The name of the connection to use to connect to
                    the Iceberg target.

                  * **Exclusions** *(list) --*

                    A list of glob patterns that identify items to
                    exclude from the crawl. For more information, see
                    Catalog Tables with a Crawler.

                    * *(string) --*

                  * **MaximumTraversalDepth** *(integer) --*

                    The maximum depth of Amazon S3 paths that the
                    crawler can traverse to discover the Iceberg
                    metadata folder in your Amazon S3 path. Used to
                    limit the crawler run time.

              * **HudiTargets** *(list) --*

                Specifies Apache Hudi data store targets.

                * *(dict) --*

                  Specifies an Apache Hudi data source.

                  * **Paths** *(list) --*

                    An array of Amazon S3 location strings for Hudi,
                    each indicating the root folder in which the
                    metadata files for a Hudi table reside. The Hudi
                    folder may be located in a child folder of the
                    root folder.

                    The crawler will scan all folders underneath a
                    path for a Hudi folder.

                    * *(string) --*

                  * **ConnectionName** *(string) --*

                    The name of the connection to use to connect to
                    the Hudi target. If your Hudi files are stored in
                    buckets that require VPC authorization, you can
                    set their connection properties here.

                  * **Exclusions** *(list) --*

                    A list of glob patterns that identify items to
                    exclude from the crawl. For more information, see
                    Catalog Tables with a Crawler.

                    * *(string) --*

                  * **MaximumTraversalDepth** *(integer) --*

                    The maximum depth of Amazon S3 paths that the
                    crawler can traverse to discover the Hudi metadata
                    folder in your Amazon S3 path. Used to limit the
                    crawler run time.

            * **DatabaseName** *(string) --*

              The name of the database in which the crawler's output
              is stored.

            * **Description** *(string) --*

              A description of the crawler.

            * **Classifiers** *(list) --*

              A list of UTF-8 strings that specify the custom
              classifiers that are associated with the crawler.

              * *(string) --*

            * **RecrawlPolicy** *(dict) --*

              A policy that specifies whether to crawl the entire
              dataset again, or to crawl only folders that were added
              since the last crawler run.

              * **RecrawlBehavior** *(string) --*

                Specifies whether to crawl the entire dataset again or
                to crawl only folders that were added since the last
                crawler run.

                A value of "CRAWL_EVERYTHING" specifies crawling the
                entire dataset again.

                A value of "CRAWL_NEW_FOLDERS_ONLY" specifies crawling
                only folders that were added since the last crawler
                run.

                A value of "CRAWL_EVENT_MODE" specifies crawling only
                the changes identified by Amazon S3 events.

            * **SchemaChangePolicy** *(dict) --*

              The policy that specifies update and delete behaviors
              for the crawler.

              * **UpdateBehavior** *(string) --*

                The update behavior when the crawler finds a changed
                schema.

              * **DeleteBehavior** *(string) --*

                The deletion behavior when the crawler finds a deleted
                object.

            * **LineageConfiguration** *(dict) --*

              A configuration that specifies whether data lineage is
              enabled for the crawler.

              * **CrawlerLineageSettings** *(string) --*

                Specifies whether data lineage is enabled for the
                crawler. Valid values are:

                * ENABLE: enables data lineage for the crawler

                * DISABLE: disables data lineage for the crawler

            * **State** *(string) --*

              Indicates whether the crawler is running, or whether a
              run is pending.

            * **TablePrefix** *(string) --*

              The prefix added to the names of tables that are
              created.

            * **Schedule** *(dict) --*

              For scheduled crawlers, the schedule when the crawler
              runs.

              * **ScheduleExpression** *(string) --*

                A "cron" expression used to specify the schedule (see
                Time-Based Schedules for Jobs and Crawlers). For
                example, to run something every day at 12:15 UTC, you
                would specify: "cron(15 12 * * ? *)".

              * **State** *(string) --*

                The state of the schedule.

            * **CrawlElapsedTime** *(integer) --*

              If the crawler is running, contains the total time
              elapsed since the last crawl began.

            * **CreationTime** *(datetime) --*

              The time that the crawler was created.

            * **LastUpdated** *(datetime) --*

              The time that the crawler was last updated.

            * **LastCrawl** *(dict) --*

              The status of the last crawl, and potentially error
              information if an error occurred.

              * **Status** *(string) --*

                Status of the last crawl.

              * **ErrorMessage** *(string) --*

                If an error occurred, the error information about the
                last crawl.

              * **LogGroup** *(string) --*

                The log group for the last crawl.

              * **LogStream** *(string) --*

                The log stream for the last crawl.

              * **MessagePrefix** *(string) --*

                The prefix for a message about this crawl.

              * **StartTime** *(datetime) --*

                The time at which the crawl started.

            * **Version** *(integer) --*

              The version of the crawler.

            * **Configuration** *(string) --*

              Crawler configuration information. This versioned JSON
              string allows users to specify aspects of a crawler's
              behavior. For more information, see Setting crawler
              configuration options.

            * **CrawlerSecurityConfiguration** *(string) --*

              The name of the "SecurityConfiguration" structure to be
              used by this crawler.

            * **LakeFormationConfiguration** *(dict) --*

              Specifies whether the crawler should use Lake Formation
              credentials for the crawler instead of the IAM role
              credentials.

              * **UseLakeFormationCredentials** *(boolean) --*

                Specifies whether to use Lake Formation credentials
                for the crawler instead of the IAM role credentials.

              * **AccountId** *(string) --*

                Required for cross-account crawls. For crawls in the
                same account as the target data, this can be left as
                null.

        * **CrawlersNotFound** *(list) --*

          A list of names of crawlers that were not found.

          * *(string) --*

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"
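
   Because the response separates crawlers that were found from names
   that were not, callers usually need to read both "Crawlers" and
   "CrawlersNotFound". The sketch below shows one way to do that; the
   helper name "summarize_crawlers" is hypothetical and it reads only
   the "Name" and "State" fields described above.

```python
def summarize_crawlers(client, crawler_names):
    """Return ({crawler name: state}, [names not found]).

    `client` is expected to behave like boto3.client('glue').
    """
    response = client.batch_get_crawlers(CrawlerNames=list(crawler_names))
    # State is read with .get() in case a crawler entry omits it.
    states = {
        crawler["Name"]: crawler.get("State")
        for crawler in response.get("Crawlers", [])
    }
    missing = list(response.get("CrawlersNotFound", []))
    return states, missing
```

   A caller might, for example, log the "missing" list and act only
   on crawlers whose state is "READY".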


list_usage_profiles
*******************

Glue.Client.list_usage_profiles(**kwargs)

   Lists all the Glue usage profiles.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.list_usage_profiles(
          NextToken='string',
          MaxResults=123
      )

   Parameters:
      * **NextToken** (*string*) -- A continuation token, included if
        this is a continuation call.

      * **MaxResults** (*integer*) -- The maximum number of usage
        profiles to return in a single response.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Profiles': [
                 {
                     'Name': 'string',
                     'Description': 'string',
                     'CreatedOn': datetime(2015, 1, 1),
                     'LastModifiedOn': datetime(2015, 1, 1)
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Profiles** *(list) --*

          A list of usage profile ("UsageProfileDefinition") objects.

          * *(dict) --*

            Describes a Glue usage profile.

            * **Name** *(string) --*

              The name of the usage profile.

            * **Description** *(string) --*

              A description of the usage profile.

            * **CreatedOn** *(datetime) --*

              The date and time when the usage profile was created.

            * **LastModifiedOn** *(datetime) --*

              The date and time when the usage profile was last
              modified.

        * **NextToken** *(string) --*

          A continuation token, present if the current list segment is
          not the last.

   **Exceptions**

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationNotSupportedException"
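
   The "NextToken" fields in the request and response can be used to
   walk through every page of results. The generator below is a
   minimal sketch of that manual loop; the name "iter_usage_profiles"
   and the default page size are assumptions made for illustration.

```python
def iter_usage_profiles(client, page_size=50):
    """Yield every usage profile, following NextToken across pages.

    `client` is expected to behave like boto3.client('glue').
    """
    kwargs = {"MaxResults": page_size}
    while True:
        response = client.list_usage_profiles(**kwargs)
        yield from response.get("Profiles", [])
        token = response.get("NextToken")
        if not token:
            # No continuation token: this was the last page.
            break
        kwargs["NextToken"] = token
```

   For example, "[p['Name'] for p in iter_usage_profiles(client)]"
   would collect the names of all profiles.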


batch_put_data_quality_statistic_annotation
*******************************************

Glue.Client.batch_put_data_quality_statistic_annotation(**kwargs)

   Annotates datapoints over time for a specific data quality
   statistic.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.batch_put_data_quality_statistic_annotation(
          InclusionAnnotations=[
              {
                  'ProfileId': 'string',
                  'StatisticId': 'string',
                  'InclusionAnnotation': 'INCLUDE'|'EXCLUDE'
              },
          ],
          ClientToken='string'
      )

   Parameters:
      * **InclusionAnnotations** (*list*) --

        **[REQUIRED]**

        A list of "DatapointInclusionAnnotation" objects.

        * *(dict) --*

          An Inclusion Annotation.

          * **ProfileId** *(string) --*

            The ID of the data quality profile the statistic belongs
            to.

          * **StatisticId** *(string) --*

            The Statistic ID.

          * **InclusionAnnotation** *(string) --*

            The inclusion annotation value to apply to the statistic.

      * **ClientToken** (*string*) -- Client Token.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'FailedInclusionAnnotations': [
                 {
                     'ProfileId': 'string',
                     'StatisticId': 'string',
                     'FailureReason': 'string'
                 },
             ]
         }

      **Response Structure**

      * *(dict) --*

        * **FailedInclusionAnnotations** *(list) --*

          A list of "AnnotationError" objects.

          * *(dict) --*

            A failed annotation.

            * **ProfileId** *(string) --*

              The Profile ID for the failed annotation.

            * **StatisticId** *(string) --*

              The Statistic ID for the failed annotation.

            * **FailureReason** *(string) --*

              The reason why the annotation failed.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"
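
   Because failures are reported per item in
   "FailedInclusionAnnotations" rather than as an exception, callers
   usually need to inspect the response. The sketch below applies one
   annotation value to several statistics and returns the failures
   keyed by statistic ID. The helper name "annotate_statistics" and
   its signature are hypothetical.

```python
def annotate_statistics(client, profile_id, statistic_ids, annotation="EXCLUDE"):
    """Apply one inclusion annotation to several statistics.

    Returns a dict mapping each failed StatisticId to its FailureReason;
    an empty dict means every annotation was accepted.
    """
    if annotation not in ("INCLUDE", "EXCLUDE"):
        raise ValueError("annotation must be 'INCLUDE' or 'EXCLUDE'")
    response = client.batch_put_data_quality_statistic_annotation(
        InclusionAnnotations=[
            {
                "ProfileId": profile_id,
                "StatisticId": statistic_id,
                "InclusionAnnotation": annotation,
            }
            for statistic_id in statistic_ids
        ]
    )
    return {
        failure["StatisticId"]: failure.get("FailureReason", "")
        for failure in response.get("FailedInclusionAnnotations", [])
    }
```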


update_partition
****************

Glue.Client.update_partition(**kwargs)

   Updates a partition.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.update_partition(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          PartitionValueList=[
              'string',
          ],
          PartitionInput={
              'Values': [
                  'string',
              ],
              'LastAccessTime': datetime(2015, 1, 1),
              'StorageDescriptor': {
                  'Columns': [
                      {
                          'Name': 'string',
                          'Type': 'string',
                          'Comment': 'string',
                          'Parameters': {
                              'string': 'string'
                          }
                      },
                  ],
                  'Location': 'string',
                  'AdditionalLocations': [
                      'string',
                  ],
                  'InputFormat': 'string',
                  'OutputFormat': 'string',
                  'Compressed': True|False,
                  'NumberOfBuckets': 123,
                  'SerdeInfo': {
                      'Name': 'string',
                      'SerializationLibrary': 'string',
                      'Parameters': {
                          'string': 'string'
                      }
                  },
                  'BucketColumns': [
                      'string',
                  ],
                  'SortColumns': [
                      {
                          'Column': 'string',
                          'SortOrder': 123
                      },
                  ],
                  'Parameters': {
                      'string': 'string'
                  },
                  'SkewedInfo': {
                      'SkewedColumnNames': [
                          'string',
                      ],
                      'SkewedColumnValues': [
                          'string',
                      ],
                      'SkewedColumnValueLocationMaps': {
                          'string': 'string'
                      }
                  },
                  'StoredAsSubDirectories': True|False,
                  'SchemaReference': {
                      'SchemaId': {
                          'SchemaArn': 'string',
                          'SchemaName': 'string',
                          'RegistryName': 'string'
                      },
                      'SchemaVersionId': 'string',
                      'SchemaVersionNumber': 123
                  }
              },
              'Parameters': {
                  'string': 'string'
              },
              'LastAnalyzedTime': datetime(2015, 1, 1)
          }
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog where
        the partition to be updated resides. If none is provided, the
        Amazon Web Services account ID is used by default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the catalog database in which the table in
        question resides.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table in which the partition to be updated is
        located.

      * **PartitionValueList** (*list*) --

        **[REQUIRED]**

        List of partition key values that define the partition to
        update.

        * *(string) --*

      * **PartitionInput** (*dict*) --

        **[REQUIRED]**

        The new partition object to update the partition to.

        The "Values" property can't be changed. If you want to change
        the partition key values for a partition, delete and recreate
        the partition.

        * **Values** *(list) --*

          The values of the partition. Although this parameter is not
          required by the SDK, you must specify this parameter for a
          valid input.

          The values for the keys for the new partition must be passed
          as an array of String objects that must be ordered in the
          same order as the partition keys appearing in the Amazon S3
          prefix. Otherwise Glue will add the values to the wrong
          keys.

          * *(string) --*

        * **LastAccessTime** *(datetime) --*

          The last time at which the partition was accessed.

        * **StorageDescriptor** *(dict) --*

          Provides information about the physical location where the
          partition is stored.

          * **Columns** *(list) --*

            A list of the "Columns" in the table.

            * *(dict) --*

              A column in a "Table".

              * **Name** *(string) --* **[REQUIRED]**

                The name of the "Column".

              * **Type** *(string) --*

                The data type of the "Column".

              * **Comment** *(string) --*

                A free-form text comment.

              * **Parameters** *(dict) --*

                These key-value pairs define properties associated
                with the column.

                * *(string) --*

                  * *(string) --*

          * **Location** *(string) --*

            The physical location of the table. By default, this takes
            the form of the warehouse location, followed by the
            database location in the warehouse, followed by the table
            name.

          * **AdditionalLocations** *(list) --*

            A list of locations that point to the path where a Delta
            table is located.

            * *(string) --*

          * **InputFormat** *(string) --*

            The input format: "SequenceFileInputFormat" (binary), or
            "TextInputFormat", or a custom format.

          * **OutputFormat** *(string) --*

            The output format: "SequenceFileOutputFormat" (binary), or
            "IgnoreKeyTextOutputFormat", or a custom format.

          * **Compressed** *(boolean) --*

            "True" if the data in the table is compressed, or "False"
            if not.

          * **NumberOfBuckets** *(integer) --*

            Must be specified if the table contains any dimension
            columns.

          * **SerdeInfo** *(dict) --*

            The serialization/deserialization (SerDe) information.

            * **Name** *(string) --*

              Name of the SerDe.

            * **SerializationLibrary** *(string) --*

              Usually the class that implements the SerDe. An example
              is
              "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe".

            * **Parameters** *(dict) --*

              These key-value pairs define initialization parameters
              for the SerDe.

              * *(string) --*

                * *(string) --*

          * **BucketColumns** *(list) --*

            A list of reducer grouping columns, clustering columns,
            and bucketing columns in the table.

            * *(string) --*

          * **SortColumns** *(list) --*

            A list specifying the sort order of each bucket in the
            table.

            * *(dict) --*

              Specifies the sort order of a sorted column.

              * **Column** *(string) --* **[REQUIRED]**

                The name of the column.

              * **SortOrder** *(integer) --* **[REQUIRED]**

                Indicates that the column is sorted in ascending order
                ("== 1"), or in descending order ("== 0").

          * **Parameters** *(dict) --*

            The user-supplied properties in key-value form.

            * *(string) --*

              * *(string) --*

          * **SkewedInfo** *(dict) --*

            The information about values that appear frequently in a
            column (skewed values).

            * **SkewedColumnNames** *(list) --*

              A list of names of columns that contain skewed values.

              * *(string) --*

            * **SkewedColumnValues** *(list) --*

              A list of values that appear so frequently as to be
              considered skewed.

              * *(string) --*

            * **SkewedColumnValueLocationMaps** *(dict) --*

              A mapping of skewed values to the columns that contain
              them.

              * *(string) --*

                * *(string) --*

          * **StoredAsSubDirectories** *(boolean) --*

            "True" if the table data is stored in subdirectories, or
            "False" if not.

          * **SchemaReference** *(dict) --*

            An object that references a schema stored in the Glue
            Schema Registry.

            When creating a table, you can pass an empty list of
            columns for the schema, and instead use a schema
            reference.

            * **SchemaId** *(dict) --*

              A structure that contains schema identity fields. Either
              this or the "SchemaVersionId" has to be provided.

              * **SchemaArn** *(string) --*

                The Amazon Resource Name (ARN) of the schema. One of
                "SchemaArn" or "SchemaName" has to be provided.

              * **SchemaName** *(string) --*

                The name of the schema. One of "SchemaArn" or
                "SchemaName" has to be provided.

              * **RegistryName** *(string) --*

                The name of the schema registry that contains the
                schema.

            * **SchemaVersionId** *(string) --*

              The unique ID assigned to a version of the schema.
              Either this or the "SchemaId" has to be provided.

            * **SchemaVersionNumber** *(integer) --*

              The version number of the schema.

        * **Parameters** *(dict) --*

          These key-value pairs define partition parameters.

          * *(string) --*

            * *(string) --*

        * **LastAnalyzedTime** *(datetime) --*

          The last time at which column statistics were computed for
          this partition.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"
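
   Since "PartitionInput" is the new partition object to update to, a
   common approach is to read the current definition, change one
   field, and send the result back. The sketch below does this for
   the storage location. It assumes "get_partition" returns the
   current definition under a "Partition" key; the helper name
   "set_partition_location" is hypothetical, and the key list is
   copied from the Request Syntax above.

```python
# Keys accepted by PartitionInput, taken from the Request Syntax above.
PARTITION_INPUT_KEYS = (
    "Values",
    "LastAccessTime",
    "StorageDescriptor",
    "Parameters",
    "LastAnalyzedTime",
)


def set_partition_location(client, database, table, values, new_location):
    """Point an existing partition at a new location, keeping other fields."""
    current = client.get_partition(
        DatabaseName=database, TableName=table, PartitionValues=list(values)
    )["Partition"]
    # Drop response-only fields (e.g. DatabaseName, CreationTime) that
    # PartitionInput does not accept.
    partition_input = {k: current[k] for k in PARTITION_INPUT_KEYS if k in current}
    storage = dict(partition_input.get("StorageDescriptor", {}))
    storage["Location"] = new_location
    partition_input["StorageDescriptor"] = storage
    # "Values" cannot be changed, so resend the same key values.
    partition_input["Values"] = list(values)
    client.update_partition(
        DatabaseName=database,
        TableName=table,
        PartitionValueList=list(values),
        PartitionInput=partition_input,
    )
    return partition_input
```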


update_column_statistics_task_settings
**************************************

Glue.Client.update_column_statistics_task_settings(**kwargs)

   Updates settings for a column statistics task.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.update_column_statistics_task_settings(
          DatabaseName='string',
          TableName='string',
          Role='string',
          Schedule='string',
          ColumnNameList=[
              'string',
          ],
          SampleSize=123.0,
          CatalogID='string',
          SecurityConfiguration='string'
      )

   Parameters:
      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the database where the table resides.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table for which to generate column statistics.

      * **Role** (*string*) -- The role used for running the column
        statistics.

      * **Schedule** (*string*) -- A schedule for running the column
        statistics, specified in CRON syntax.

      * **ColumnNameList** (*list*) --

        A list of column names for which to run statistics.

        * *(string) --*

      * **SampleSize** (*float*) -- The percentage of data to sample.

      * **CatalogID** (*string*) -- The ID of the Data Catalog in
        which the database resides.

      * **SecurityConfiguration** (*string*) -- Name of the security
        configuration that is used to encrypt CloudWatch logs.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.VersionMismatchException"

   * "Glue.Client.exceptions.OperationTimeoutException"
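
   Only "DatabaseName" and "TableName" are required, so it can help to
   build the keyword arguments dynamically and leave unset options out
   entirely (botocore's parameter validation rejects "None" values).
   The helper below is a sketch: its name, the 0 to 100 range check on
   "SampleSize", and the example schedule string are assumptions and
   are not stated in this reference.

```python
def build_task_settings(database, table, *, role=None, schedule=None,
                        columns=None, sample_size=None):
    """Return kwargs for update_column_statistics_task_settings.

    Optional settings that are None are omitted from the result.
    """
    # Assumption: SampleSize is a percentage, so 0..100 is treated as valid.
    if sample_size is not None and not 0 <= sample_size <= 100:
        raise ValueError("sample_size must be between 0 and 100")
    kwargs = {"DatabaseName": database, "TableName": table}
    optional = {
        "Role": role,
        "Schedule": schedule,
        "ColumnNameList": columns,
        # The parameter is documented as a float.
        "SampleSize": None if sample_size is None else float(sample_size),
    }
    kwargs.update({key: value for key, value in optional.items() if value is not None})
    return kwargs
```

   The result can be passed as
   "client.update_column_statistics_task_settings(**kwargs)", for
   example with "schedule='cron(0 2 * * ? *)'" if that cron form is
   accepted for this operation.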


delete_session
**************

Glue.Client.delete_session(**kwargs)

   Deletes the session.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_session(
          Id='string',
          RequestOrigin='string'
      )

   Parameters:
      * **Id** (*string*) --

        **[REQUIRED]**

        The ID of the session to be deleted.

      * **RequestOrigin** (*string*) -- The name of the origin of the
        delete session request.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Id': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Id** *(string) --*

          Returns the ID of the deleted session.

   **Exceptions**

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.IllegalSessionStateException"

   * "Glue.Client.exceptions.ConcurrentModificationException"


register_schema_version
***********************

Glue.Client.register_schema_version(**kwargs)

   Adds a new version to the existing schema. Returns an error if the
   new version of the schema does not meet the compatibility
   requirements of the schema set. This API will not create a new
   schema set and will return a 404 error if the schema set is not
   already present in the Schema Registry.

   If this is the first schema definition to be registered in the
   Schema Registry, this API will store the schema version and return
   immediately. Otherwise, this call has the potential to run longer
   than other operations due to compatibility modes. You can call the
   "GetSchemaVersion" API with the "SchemaVersionId" to check
   compatibility modes.

   If the same schema definition is already stored in Schema Registry
   as a version, the schema ID of the existing schema is returned to
   the caller.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.register_schema_version(
          SchemaId={
              'SchemaArn': 'string',
              'SchemaName': 'string',
              'RegistryName': 'string'
          },
          SchemaDefinition='string'
      )

   Parameters:
      * **SchemaId** (*dict*) --

        **[REQUIRED]**

        This is a wrapper structure to contain schema identity fields.
        The structure contains:

        * SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the
          schema. Either "SchemaArn" or "SchemaName" and
          "RegistryName" has to be provided.

        * SchemaId$SchemaName: The name of the schema. Either
          "SchemaArn" or "SchemaName" and "RegistryName" has to be
          provided.

        * **SchemaArn** *(string) --*

          The Amazon Resource Name (ARN) of the schema. One of
          "SchemaArn" or "SchemaName" has to be provided.

        * **SchemaName** *(string) --*

          The name of the schema. One of "SchemaArn" or "SchemaName"
          has to be provided.

        * **RegistryName** *(string) --*

          The name of the schema registry that contains the schema.

      * **SchemaDefinition** (*string*) --

        **[REQUIRED]**

        The schema definition using the "DataFormat" setting for the
        "SchemaName".

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'SchemaVersionId': 'string',
             'VersionNumber': 123,
             'Status': 'AVAILABLE'|'PENDING'|'FAILURE'|'DELETING'
         }

      **Response Structure**

      * *(dict) --*

        * **SchemaVersionId** *(string) --*

          The unique ID that represents the version of this schema.

        * **VersionNumber** *(integer) --*

          The version of this schema (for sync flow only, in case this
          is the first version).

        * **Status** *(string) --*

          The status of the schema version.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.ConcurrentModificationException"

   * "Glue.Client.exceptions.InternalServiceException"
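
   The description above suggests following up with "GetSchemaVersion"
   when registration does not complete immediately. The sketch below
   registers a definition and then polls "get_schema_version" by
   "SchemaVersionId" while the status is "PENDING". The helper name,
   the polling interval, and the poll limit are illustrative choices,
   not service recommendations.

```python
import time


def register_and_wait(client, schema_name, registry_name, definition,
                      poll_seconds=2.0, max_polls=30):
    """Register a schema version and wait until it leaves PENDING.

    Returns (schema_version_id, final_status). The status may still be
    "PENDING" if max_polls is exhausted.
    """
    response = client.register_schema_version(
        SchemaId={"SchemaName": schema_name, "RegistryName": registry_name},
        SchemaDefinition=definition,
    )
    version_id = response["SchemaVersionId"]
    status = response["Status"]
    polls = 0
    while status == "PENDING" and polls < max_polls:
        time.sleep(poll_seconds)
        status = client.get_schema_version(SchemaVersionId=version_id)["Status"]
        polls += 1
    return version_id, status
```

   A final status of "FAILURE" typically means the compatibility
   check rejected the new definition, so callers may want to treat it
   as an error.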


update_job
**********

Glue.Client.update_job(**kwargs)

   Updates an existing job definition. The previous job definition is
   completely overwritten by this information.

   See also: AWS API Documentation

      **Request Syntax**

         # This section is too large to render.
         # Please see the AWS API Documentation linked below.

      AWS API Documentation

      **Parameters**

         # This section is too large to render.
         # Please see the AWS API Documentation linked below.

      AWS API Documentation

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'JobName': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **JobName** *(string) --*

          Returns the name of the updated job definition.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ConcurrentModificationException"
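
   Because the previous job definition is completely overwritten, a
   partial change is usually done as read, modify, write. The sketch
   below assumes "get_job" returns the current definition under a
   "Job" key and that "update_job" takes "JobName" and "JobUpdate".
   The helper name "patch_job" and the list of keys to drop are
   assumptions: the request syntax is not rendered in this reference,
   so further keys may need to be removed for some jobs.

```python
# Assumption: these response fields are not accepted inside JobUpdate.
READ_ONLY_JOB_KEYS = ("Name", "CreatedOn", "LastModifiedOn")


def patch_job(client, job_name, changes, drop_keys=READ_ONLY_JOB_KEYS):
    """Apply `changes` on top of the existing job definition.

    Returns the JobName reported by update_job.
    """
    job = client.get_job(JobName=job_name)["Job"]
    # Start from the full current definition so unchanged settings survive.
    job_update = {key: value for key, value in job.items() if key not in drop_keys}
    job_update.update(changes)
    response = client.update_job(JobName=job_name, JobUpdate=job_update)
    return response["JobName"]
```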


list_sessions
*************

Glue.Client.list_sessions(**kwargs)

   Retrieve a list of sessions.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.list_sessions(
          NextToken='string',
          MaxResults=123,
          Tags={
              'string': 'string'
          },
          RequestOrigin='string'
      )

   Parameters:
      * **NextToken** (*string*) -- The token for the next set of
        results, or null if there are no more results.

      * **MaxResults** (*integer*) -- The maximum number of results.

      * **Tags** (*dict*) --

        Tags belonging to the session.

        * *(string) --*

          * *(string) --*

      * **RequestOrigin** (*string*) -- The origin of the request.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Ids': [
                 'string',
             ],
             'Sessions': [
                 {
                     'Id': 'string',
                     'CreatedOn': datetime(2015, 1, 1),
                     'Status': 'PROVISIONING'|'READY'|'FAILED'|'TIMEOUT'|'STOPPING'|'STOPPED',
                     'ErrorMessage': 'string',
                     'Description': 'string',
                     'Role': 'string',
                     'Command': {
                         'Name': 'string',
                         'PythonVersion': 'string'
                     },
                     'DefaultArguments': {
                         'string': 'string'
                     },
                     'Connections': {
                         'Connections': [
                             'string',
                         ]
                     },
                     'Progress': 123.0,
                     'MaxCapacity': 123.0,
                     'SecurityConfiguration': 'string',
                     'GlueVersion': 'string',
                     'NumberOfWorkers': 123,
                     'WorkerType': 'Standard'|'G.1X'|'G.2X'|'G.025X'|'G.4X'|'G.8X'|'Z.2X',
                     'CompletedOn': datetime(2015, 1, 1),
                     'ExecutionTime': 123.0,
                     'DPUSeconds': 123.0,
                     'IdleTimeout': 123,
                     'ProfileName': 'string'
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Ids** *(list) --*

          Returns the IDs of the sessions.

          * *(string) --*

        * **Sessions** *(list) --*

          Returns the session objects.

          * *(dict) --*

            The period in which a remote Spark runtime environment is
            running.

            * **Id** *(string) --*

              The ID of the session.

            * **CreatedOn** *(datetime) --*

              The time and date when the session was created.

            * **Status** *(string) --*

              The session status.

            * **ErrorMessage** *(string) --*

              The error message displayed during the session.

            * **Description** *(string) --*

              The description of the session.

            * **Role** *(string) --*

              The name or Amazon Resource Name (ARN) of the IAM role
              associated with the Session.

            * **Command** *(dict) --*

              The command object. See SessionCommand.

              * **Name** *(string) --*

                Specifies the name of the SessionCommand. Can be
                'glueetl' or 'gluestreaming'.

              * **PythonVersion** *(string) --*

                Specifies the Python version. The Python version
                indicates the version supported for jobs of type
                Spark.

            * **DefaultArguments** *(dict) --*

              A map array of key-value pairs. Max is 75 pairs.

              * *(string) --*

                * *(string) --*

            * **Connections** *(dict) --*

              The number of connections used for the session.

              * **Connections** *(list) --*

                A list of connections used by the job.

                * *(string) --*

            * **Progress** *(float) --*

              The code execution progress of the session.

            * **MaxCapacity** *(float) --*

              The number of Glue data processing units (DPUs) that can
              be allocated when the job runs. A DPU is a relative
              measure of processing power that consists of 4 vCPUs of
              compute capacity and 16 GB memory.

            * **SecurityConfiguration** *(string) --*

              The name of the SecurityConfiguration structure to be
              used with the session.

            * **GlueVersion** *(string) --*

              The Glue version determines the versions of Apache Spark
              and Python that Glue supports. The GlueVersion must be
              greater than 2.0.

            * **NumberOfWorkers** *(integer) --*

              The number of workers of a defined "WorkerType" to use
              for the session.

            * **WorkerType** *(string) --*

              The type of predefined worker that is allocated when a
              session runs. Accepts a value of "G.1X", "G.2X", "G.4X",
              or "G.8X" for Spark sessions. Accepts the value "Z.2X"
              for Ray sessions.

            * **CompletedOn** *(datetime) --*

              The date and time that this session is completed.

            * **ExecutionTime** *(float) --*

              The total time the session ran for.

            * **DPUSeconds** *(float) --*

              The DPUs consumed by the session (formula: ExecutionTime
              * MaxCapacity).

            * **IdleTimeout** *(integer) --*

              The number of minutes when idle before the session times
              out.

            * **ProfileName** *(string) --*

              The name of a Glue usage profile associated with the
              session.

        * **NextToken** *(string) --*

          The token for the next set of results, or null if there are
          no more results.

   **Exceptions**

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
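
   The response carries one "Status" per session, so a typical use is
   to page through all sessions and group their IDs by state. The
   sketch below does that with a manual "NextToken" loop; the helper
   name "session_ids_by_status" and the "UNKNOWN" fallback label are
   illustrative.

```python
from collections import defaultdict


def session_ids_by_status(client, tags=None, page_size=25):
    """Return {status: [session ids]} across all pages of list_sessions."""
    kwargs = {"MaxResults": page_size}
    if tags:
        # Optional tag filter, passed through unchanged.
        kwargs["Tags"] = tags
    grouped = defaultdict(list)
    while True:
        response = client.list_sessions(**kwargs)
        for session in response.get("Sessions", []):
            grouped[session.get("Status", "UNKNOWN")].append(session["Id"])
        token = response.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token
    return dict(grouped)
```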


batch_delete_table
******************

Glue.Client.batch_delete_table(**kwargs)

   Deletes multiple tables at once.

   Note:

     After completing this operation, you no longer have access to the
     table versions and partitions that belong to the deleted table.
     Glue deletes these "orphaned" resources asynchronously in a
     timely manner, at the discretion of the service. To ensure the
     immediate deletion of all related resources, before calling
     "BatchDeleteTable", use "DeleteTableVersion" or
     "BatchDeleteTableVersion", and "DeletePartition" or
     "BatchDeletePartition", to delete any resources that belong to
     the table.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.batch_delete_table(
          CatalogId='string',
          DatabaseName='string',
          TablesToDelete=[
              'string',
          ],
          TransactionId='string'
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog where
        the table resides. If none is provided, the Amazon Web
        Services account ID is used by default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the catalog database in which the tables to delete
        reside. For Hive compatibility, this name is entirely
        lowercase.

      * **TablesToDelete** (*list*) --

        **[REQUIRED]**

        A list of the tables to delete.

        * *(string) --*

      * **TransactionId** (*string*) -- The transaction ID at which to
        delete the table contents.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Errors': [
                 {
                     'TableName': 'string',
                     'ErrorDetail': {
                         'ErrorCode': 'string',
                         'ErrorMessage': 'string'
                     }
                 },
             ]
         }

      **Response Structure**

      * *(dict) --*

        * **Errors** *(list) --*

          A list of errors encountered in attempting to delete the
          specified tables.

          * *(dict) --*

            An error record for table operations.

            * **TableName** *(string) --*

              The name of the table. For Hive compatibility, this must
              be entirely lowercase.

            * **ErrorDetail** *(dict) --*

              The details about the error.

              * **ErrorCode** *(string) --*

                The code associated with this error.

              * **ErrorMessage** *(string) --*

                A message describing the error.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"

   * "Glue.Client.exceptions.ResourceNotReadyException"
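
   Per-table failures are returned in "Errors" instead of being
   raised, so a successful call does not guarantee that every table
   was deleted. The sketch below sends the names in chunks and
   collects the reported errors. The helper name "delete_tables" and
   the "chunk_size" default of 100 are assumptions; this reference
   does not state a per-call limit.

```python
def delete_tables(client, database, table_names, chunk_size=100):
    """Delete tables in chunks.

    Returns {table name: (error code, error message)} for every table the
    service reported as failed; an empty dict means no errors were reported.
    """
    names = list(table_names)
    errors = {}
    for start in range(0, len(names), chunk_size):
        chunk = names[start:start + chunk_size]
        response = client.batch_delete_table(
            DatabaseName=database, TablesToDelete=chunk
        )
        for error in response.get("Errors", []):
            detail = error.get("ErrorDetail", {})
            errors[error["TableName"]] = (
                detail.get("ErrorCode"),
                detail.get("ErrorMessage"),
            )
    return errors
```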


delete_usage_profile
********************

Glue.Client.delete_usage_profile(**kwargs)

   Deletes the specified Glue usage profile.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_usage_profile(
          Name='string'
      )

   Parameters:
      **Name** (*string*) --

      **[REQUIRED]**

      The name of the usage profile to delete.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.OperationNotSupportedException"


stop_column_statistics_task_run
*******************************

Glue.Client.stop_column_statistics_task_run(**kwargs)

   Stops a task run for the specified table.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.stop_column_statistics_task_run(
          DatabaseName='string',
          TableName='string'
      )

   Parameters:
      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the database where the table resides.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.ColumnStatisticsTaskNotRunningException"

   * "Glue.Client.exceptions.ColumnStatisticsTaskStoppingException"

   * "Glue.Client.exceptions.OperationTimeoutException"


stop_crawler_schedule
*********************

Glue.Client.stop_crawler_schedule(**kwargs)

   Sets the schedule state of the specified crawler to
   "NOT_SCHEDULED", but does not stop the crawler if it is already
   running.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.stop_crawler_schedule(
          CrawlerName='string'
      )

   Parameters:
      **CrawlerName** (*string*) --

      **[REQUIRED]**

      Name of the crawler whose schedule state to set.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.SchedulerNotRunningException"

   * "Glue.Client.exceptions.SchedulerTransitioningException"

   * "Glue.Client.exceptions.OperationTimeoutException"
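
   The call can be wrapped so that repeating it is harmless. The
   sketch below treats "SchedulerNotRunningException" as "nothing to
   do"; that reading is an assumption based on the exception name, and
   the helper name "ensure_schedule_stopped" is hypothetical.
   "client" is assumed to be a "boto3.client('glue')" instance.

```python
def ensure_schedule_stopped(client, crawler_name):
    """Stop a crawler's schedule; return True if this call changed it.

    `client` is assumed to be a Glue client, e.g. boto3.client('glue').
    """
    try:
        client.stop_crawler_schedule(CrawlerName=crawler_name)
        return True
    except client.exceptions.SchedulerNotRunningException:
        # Assumption: the schedule was not running, so there is nothing to stop.
        return False
```

   As described above, this only changes the schedule state. A crawl
   that is already running is not stopped by this call.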


get_ml_transforms
*****************

Glue.Client.get_ml_transforms(**kwargs)

   Gets a sortable, filterable list of existing Glue machine learning
   transforms. Machine learning transforms are a special type of
   transform that use machine learning to learn the details of the
   transformation to be performed by learning from examples provided
   by humans. These transformations are then saved by Glue, and you
   can retrieve their metadata by calling "GetMLTransforms".

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_ml_transforms(
          NextToken='string',
          MaxResults=123,
          Filter={
              'Name': 'string',
              'TransformType': 'FIND_MATCHES',
              'Status': 'NOT_READY'|'READY'|'DELETING',
              'GlueVersion': 'string',
              'CreatedBefore': datetime(2015, 1, 1),
              'CreatedAfter': datetime(2015, 1, 1),
              'LastModifiedBefore': datetime(2015, 1, 1),
              'LastModifiedAfter': datetime(2015, 1, 1),
              'Schema': [
                  {
                      'Name': 'string',
                      'DataType': 'string'
                  },
              ]
          },
          Sort={
              'Column': 'NAME'|'TRANSFORM_TYPE'|'STATUS'|'CREATED'|'LAST_MODIFIED',
              'SortDirection': 'DESCENDING'|'ASCENDING'
          }
      )

   Parameters:
      * **NextToken** (*string*) -- A pagination token to offset the
        results.

      * **MaxResults** (*integer*) -- The maximum number of results to
        return.

      * **Filter** (*dict*) --

        The filter transformation criteria.

        * **Name** *(string) --*

          A unique transform name that is used to filter the machine
          learning transforms.

        * **TransformType** *(string) --*

          The type of machine learning transform that is used to
          filter the machine learning transforms.

        * **Status** *(string) --*

          Filters the list of machine learning transforms by the last
          known status of the transforms (to indicate whether a
          transform can be used or not). One of "NOT_READY", "READY",
          or "DELETING".

        * **GlueVersion** *(string) --*

          This value determines which version of Glue this machine
          learning transform is compatible with. Glue 1.0 is
          recommended for most customers. If the value is not set, the
          Glue compatibility defaults to Glue 0.9. For more
          information, see Glue Versions in the developer guide.

        * **CreatedBefore** *(datetime) --*

          The time and date before which the transforms were created.

        * **CreatedAfter** *(datetime) --*

          The time and date after which the transforms were created.

        * **LastModifiedBefore** *(datetime) --*

          Filter on transforms last modified before this date.

        * **LastModifiedAfter** *(datetime) --*

          Filter on transforms last modified after this date.

        * **Schema** *(list) --*

          Filters on datasets with a specific schema. The "Map<Column,
          Type>" object is an array of key-value pairs representing
          the schema this transform accepts, where "Column" is the
          name of a column, and "Type" is the type of the data such as
          an integer or string. Has an upper bound of 100 columns.

          * *(dict) --*

            A key-value pair representing a column and data type that
            this transform can run against. The "Schema" parameter of
            the "MLTransform" may contain up to 100 of these
            structures.

            * **Name** *(string) --*

              The name of the column.

            * **DataType** *(string) --*

              The type of data in the column.

      * **Sort** (*dict*) --

        The sorting criteria.

        * **Column** *(string) --* **[REQUIRED]**

          The column to be used in the sorting criteria that are
          associated with the machine learning transform.

        * **SortDirection** *(string) --* **[REQUIRED]**

          The sort direction to be used in the sorting criteria that
          are associated with the machine learning transform.
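
   The "Filter" and "Sort" arguments are plain dictionaries, so they
   can be assembled before the call. The following is a sketch only:
   the helper name "build_ml_transform_query" and its defaults are
   hypothetical, and it covers just a few of the filter keys listed
   above. The allowed values are copied from the request syntax.

```python
from datetime import datetime

_STATUSES = {"NOT_READY", "READY", "DELETING"}
_SORT_COLUMNS = {"NAME", "TRANSFORM_TYPE", "STATUS", "CREATED", "LAST_MODIFIED"}


def build_ml_transform_query(name=None, status=None, created_after=None,
                             sort_column="CREATED", descending=True):
    """Return keyword arguments suitable for client.get_ml_transforms()."""
    if status is not None and status not in _STATUSES:
        raise ValueError(f"unknown Status: {status!r}")
    if sort_column not in _SORT_COLUMNS:
        raise ValueError(f"unknown Sort column: {sort_column!r}")
    if created_after is not None and not isinstance(created_after, datetime):
        raise TypeError("created_after must be a datetime")

    # Only include the filter keys that the caller actually supplied.
    filter_ = {}
    if name is not None:
        filter_["Name"] = name
    if status is not None:
        filter_["Status"] = status
    if created_after is not None:
        filter_["CreatedAfter"] = created_after

    # Both Sort keys are marked [REQUIRED] whenever Sort is passed.
    kwargs = {
        "Sort": {
            "Column": sort_column,
            "SortDirection": "DESCENDING" if descending else "ASCENDING",
        }
    }
    if filter_:
        kwargs["Filter"] = filter_
    return kwargs
```

   The result can be passed straight through, for example
   "client.get_ml_transforms(**build_ml_transform_query(status='READY'))".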

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Transforms': [
                 {
                     'TransformId': 'string',
                     'Name': 'string',
                     'Description': 'string',
                     'Status': 'NOT_READY'|'READY'|'DELETING',
                     'CreatedOn': datetime(2015, 1, 1),
                     'LastModifiedOn': datetime(2015, 1, 1),
                     'InputRecordTables': [
                         {
                             'DatabaseName': 'string',
                             'TableName': 'string',
                             'CatalogId': 'string',
                             'ConnectionName': 'string',
                             'AdditionalOptions': {
                                 'string': 'string'
                             }
                         },
                     ],
                     'Parameters': {
                         'TransformType': 'FIND_MATCHES',
                         'FindMatchesParameters': {
                             'PrimaryKeyColumnName': 'string',
                             'PrecisionRecallTradeoff': 123.0,
                             'AccuracyCostTradeoff': 123.0,
                             'EnforceProvidedLabels': True|False
                         }
                     },
                     'EvaluationMetrics': {
                         'TransformType': 'FIND_MATCHES',
                         'FindMatchesMetrics': {
                             'AreaUnderPRCurve': 123.0,
                             'Precision': 123.0,
                             'Recall': 123.0,
                             'F1': 123.0,
                             'ConfusionMatrix': {
                                 'NumTruePositives': 123,
                                 'NumFalsePositives': 123,
                                 'NumTrueNegatives': 123,
                                 'NumFalseNegatives': 123
                             },
                             'ColumnImportances': [
                                 {
                                     'ColumnName': 'string',
                                     'Importance': 123.0
                                 },
                             ]
                         }
                     },
                     'LabelCount': 123,
                     'Schema': [
                         {
                             'Name': 'string',
                             'DataType': 'string'
                         },
                     ],
                     'Role': 'string',
                     'GlueVersion': 'string',
                     'MaxCapacity': 123.0,
                     'WorkerType': 'Standard'|'G.1X'|'G.2X'|'G.025X'|'G.4X'|'G.8X'|'Z.2X',
                     'NumberOfWorkers': 123,
                     'Timeout': 123,
                     'MaxRetries': 123,
                     'TransformEncryption': {
                         'MlUserDataEncryption': {
                             'MlUserDataEncryptionMode': 'DISABLED'|'SSE-KMS',
                             'KmsKeyId': 'string'
                         },
                         'TaskRunSecurityConfigurationName': 'string'
                     }
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Transforms** *(list) --*

          A list of machine learning transforms.

          * *(dict) --*

            A structure for a machine learning transform.

            * **TransformId** *(string) --*

              The unique transform ID that is generated for the
              machine learning transform. The ID is guaranteed to be
              unique and does not change.

            * **Name** *(string) --*

              A user-defined name for the machine learning transform.
              Names are not guaranteed to be unique and can be changed
              at any time.

            * **Description** *(string) --*

              A user-defined, long-form description text for the
              machine learning transform. Descriptions are not
              guaranteed to be unique and can be changed at any time.

            * **Status** *(string) --*

              The current status of the machine learning transform.

            * **CreatedOn** *(datetime) --*

              A timestamp. The time and date that this machine
              learning transform was created.

            * **LastModifiedOn** *(datetime) --*

              A timestamp. The last point in time when this machine
              learning transform was modified.

            * **InputRecordTables** *(list) --*

              A list of Glue table definitions used by the transform.

              * *(dict) --*

                The database and table in the Glue Data Catalog that
                is used for input or output data.

                * **DatabaseName** *(string) --*

                  A database name in the Glue Data Catalog.

                * **TableName** *(string) --*

                  A table name in the Glue Data Catalog.

                * **CatalogId** *(string) --*

                  A unique identifier for the Glue Data Catalog.

                * **ConnectionName** *(string) --*

                  The name of the connection to the Glue Data Catalog.

                * **AdditionalOptions** *(dict) --*

                  Additional options for the table. Currently there
                  are two keys supported:

                  * "pushDownPredicate": to filter on partitions
                    without having to list and read all the files in
                    your dataset.

                  * "catalogPartitionPredicate": to use server-side
                    partition pruning using partition indexes in the
                    Glue Data Catalog.

                  * *(string) --*

                    * *(string) --*

            * **Parameters** *(dict) --*

              A "TransformParameters" object. You can use parameters
              to tune (customize) the behavior of the machine learning
              transform by specifying what data it learns from and
              your preference on various tradeoffs (such as precision
              vs. recall, or accuracy vs. cost).

              * **TransformType** *(string) --*

                The type of machine learning transform.

                For information about the types of machine learning
                transforms, see Creating Machine Learning Transforms.

              * **FindMatchesParameters** *(dict) --*

                The parameters for the find matches algorithm.

                * **PrimaryKeyColumnName** *(string) --*

                  The name of a column that uniquely identifies rows
                  in the source table. Used to help identify matching
                  records.

                * **PrecisionRecallTradeoff** *(float) --*

                  The value selected when tuning your transform for a
                  balance between precision and recall. A value of 0.5
                  means no preference; a value of 1.0 means a bias
                  purely for precision, and a value of 0.0 means a
                  bias for recall. Because this is a tradeoff,
                  choosing values close to 1.0 means very low recall,
                  and choosing values close to 0.0 results in very low
                  precision.

                  The precision metric indicates how often your model
                  is correct when it predicts a match.

                  The recall metric indicates how often your model
                  predicts a match when there is an actual match.

                * **AccuracyCostTradeoff** *(float) --*

                  The value that is selected when tuning your
                  transform for a balance between accuracy and cost. A
                  value of 0.5 means that the system balances accuracy
                  and cost concerns. A value of 1.0 means a bias
                  purely for accuracy, which typically results in a
                  higher cost, sometimes substantially higher. A value
                  of 0.0 means a bias purely for cost, which results
                  in a less accurate "FindMatches" transform,
                  sometimes with unacceptable accuracy.

                  Accuracy measures how well the transform finds true
                  positives and true negatives. Increasing accuracy
                  requires more machine resources and cost. But it
                  also results in increased recall.

                  Cost measures how many compute resources, and thus
                  money, are consumed to run the transform.

                * **EnforceProvidedLabels** *(boolean) --*

                  The value to switch on or off to force the output to
                  match the provided labels from users. If the value
                  is "True", the "find matches" transform forces the
                  output to match the provided labels. The results
                  override the normal conflation results. If the value
                  is "False", the "find matches" transform does not
                  ensure all the labels provided are respected, and
                  the results rely on the trained model.

                  Note that setting this value to true may increase
                  the conflation execution time.

            * **EvaluationMetrics** *(dict) --*

              An "EvaluationMetrics" object. Evaluation metrics
              provide an estimate of the quality of your machine
              learning transform.

              * **TransformType** *(string) --*

                The type of machine learning transform.

              * **FindMatchesMetrics** *(dict) --*

                The evaluation metrics for the find matches algorithm.

                * **AreaUnderPRCurve** *(float) --*

                  The area under the precision/recall curve (AUPRC) is
                  a single number measuring the overall quality of the
                  transform, that is independent of the choice made
                  for precision vs. recall. Higher values indicate
                  that you have a more attractive precision vs. recall
                  tradeoff.

                  For more information, see Precision and recall in
                  Wikipedia.

                * **Precision** *(float) --*

                  The precision metric indicates how often your
                  transform is correct when it predicts a match.
                  Specifically, it measures how many of the matches
                  that the transform predicts are true positives.

                  For more information, see Precision and recall in
                  Wikipedia.

                * **Recall** *(float) --*

                  The recall metric indicates how often your transform
                  predicts a match when there is an actual match.
                  Specifically, it measures how many of the actual
                  matches in the source data the transform finds.

                  For more information, see Precision and recall in
                  Wikipedia.

                * **F1** *(float) --*

                  The maximum F1 metric indicates the transform's
                  accuracy between 0 and 1, where 1 is the best
                  accuracy.

                  For more information, see F1 score in Wikipedia.

                * **ConfusionMatrix** *(dict) --*

                  The confusion matrix shows you what your transform
                  is predicting accurately and what types of errors it
                  is making.

                  For more information, see Confusion matrix in
                  Wikipedia.

                  * **NumTruePositives** *(integer) --*

                    The number of matches in the data that the
                    transform correctly found, in the confusion matrix
                    for your transform.

                  * **NumFalsePositives** *(integer) --*

                    The number of nonmatches in the data that the
                    transform incorrectly classified as a match, in
                    the confusion matrix for your transform.

                  * **NumTrueNegatives** *(integer) --*

                    The number of nonmatches in the data that the
                    transform correctly rejected, in the confusion
                    matrix for your transform.

                  * **NumFalseNegatives** *(integer) --*

                    The number of matches in the data that the
                    transform didn't find, in the confusion matrix for
                    your transform.

                * **ColumnImportances** *(list) --*

                  A list of "ColumnImportance" structures containing
                  column importance metrics, sorted in order of
                  descending importance.

                  * *(dict) --*

                    A structure containing the column name and column
                    importance score for a column.

                    Column importance helps you understand how columns
                    contribute to your model, by identifying which
                    columns in your records are more important than
                    others.

                    * **ColumnName** *(string) --*

                      The name of a column.

                    * **Importance** *(float) --*

                      The column importance score for the column, as a
                      decimal.

            * **LabelCount** *(integer) --*

              A count identifier for the labeling files generated by
              Glue for this transform. As you create a better
              transform, you can iteratively download, label, and
              upload the labeling file.

            * **Schema** *(list) --*

              A map of key-value pairs representing the columns and
              data types that this transform can run against. Has an
              upper bound of 100 columns.

              * *(dict) --*

                A key-value pair representing a column and data type
                that this transform can run against. The "Schema"
                parameter of the "MLTransform" may contain up to 100
                of these structures.

                * **Name** *(string) --*

                  The name of the column.

                * **DataType** *(string) --*

                  The type of data in the column.

            * **Role** *(string) --*

              The name or Amazon Resource Name (ARN) of the IAM role
              with the required permissions. The required permissions
              include both Glue service role permissions to Glue
              resources, and Amazon S3 permissions required by the
              transform.

              * This role needs Glue service role permissions to allow
                access to resources in Glue. See Attach a Policy to
                IAM Users That Access Glue.

              * This role needs permission to your Amazon Simple
                Storage Service (Amazon S3) sources, targets,
                temporary directory, scripts, and any libraries used
                by the task run for this transform.

            * **GlueVersion** *(string) --*

              This value determines which version of Glue this machine
              learning transform is compatible with. Glue 1.0 is
              recommended for most customers. If the value is not set,
              the Glue compatibility defaults to Glue 0.9. For more
              information, see Glue Versions in the developer guide.

            * **MaxCapacity** *(float) --*

              The number of Glue data processing units (DPUs) that are
              allocated to task runs for this transform. You can
              allocate from 2 to 100 DPUs; the default is 10. A DPU is
              a relative measure of processing power that consists of
              4 vCPUs of compute capacity and 16 GB of memory. For
              more information, see the Glue pricing page.

              "MaxCapacity" is a mutually exclusive option with
              "NumberOfWorkers" and "WorkerType".

              * If either "NumberOfWorkers" or "WorkerType" is set,
                then "MaxCapacity" cannot be set.

              * If "MaxCapacity" is set, then neither
                "NumberOfWorkers" nor "WorkerType" can be set.

              * If "WorkerType" is set, then "NumberOfWorkers" is
                required (and vice versa).

              * "MaxCapacity" and "NumberOfWorkers" must both be at
                least 1.

              When the "WorkerType" field is set to a value other than
              "Standard", the "MaxCapacity" field is set automatically
              and becomes read-only.

            * **WorkerType** *(string) --*

              The type of predefined worker that is allocated when a
              task of this transform runs. Accepts a value of
              Standard, G.1X, or G.2X.

              * For the "Standard" worker type, each worker provides 4
                vCPU, 16 GB of memory and a 50GB disk, and 2 executors
                per worker.

              * For the "G.1X" worker type, each worker provides 4
                vCPU, 16 GB of memory and a 64GB disk, and 1 executor
                per worker.

              * For the "G.2X" worker type, each worker provides 8
                vCPU, 32 GB of memory and a 128GB disk, and 1 executor
                per worker.

              "MaxCapacity" is a mutually exclusive option with
              "NumberOfWorkers" and "WorkerType".

              * If either "NumberOfWorkers" or "WorkerType" is set,
                then "MaxCapacity" cannot be set.

              * If "MaxCapacity" is set, then neither
                "NumberOfWorkers" nor "WorkerType" can be set.

              * If "WorkerType" is set, then "NumberOfWorkers" is
                required (and vice versa).

              * "MaxCapacity" and "NumberOfWorkers" must both be at
                least 1.

            * **NumberOfWorkers** *(integer) --*

              The number of workers of a defined "WorkerType" that are
              allocated when a task of the transform runs.

              If "WorkerType" is set, then "NumberOfWorkers" is
              required (and vice versa).

            * **Timeout** *(integer) --*

              The timeout in minutes of the machine learning
              transform.

            * **MaxRetries** *(integer) --*

              The maximum number of times to retry after an
              "MLTaskRun" of the machine learning transform fails.

            * **TransformEncryption** *(dict) --*

              The encryption-at-rest settings of the transform that
              apply to accessing user data. Machine learning
              transforms can access user data encrypted in Amazon S3
              using KMS.

              * **MlUserDataEncryption** *(dict) --*

                An "MLUserDataEncryption" object containing the
                encryption mode and customer-provided KMS key ID.

                * **MlUserDataEncryptionMode** *(string) --*

                  The encryption mode applied to user data. Valid
                  values are:

                  * DISABLED: encryption is disabled

                  * SSE-KMS: use of server-side encryption with Key
                    Management Service (SSE-KMS) for user data stored
                    in Amazon S3.

                * **KmsKeyId** *(string) --*

                  The ID for the customer-provided KMS key.

              * **TaskRunSecurityConfigurationName** *(string) --*

                The name of the security configuration.

        * **NextToken** *(string) --*

          A pagination token, if more results are available.
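
      The counts in "ConfusionMatrix" can be related to precision,
      recall, and F1 with the standard textbook formulas. This is a
      sketch under that assumption: the helper name
      "metrics_from_confusion_matrix" is hypothetical, and the
      "Precision", "Recall", and "F1" values that Glue reports may be
      computed differently (the "F1" field is described as a maximum
      F1), so the results need not match them exactly.

```python
def metrics_from_confusion_matrix(cm):
    """Compute precision, recall and F1 from a ConfusionMatrix dict.

    Uses the standard definitions:
      precision = TP / (TP + FP)
      recall    = TP / (TP + FN)
      F1        = 2 * precision * recall / (precision + recall)
    A ratio with a zero denominator is reported as 0.0.
    """
    tp = cm["NumTruePositives"]
    fp = cm["NumFalsePositives"]
    fn = cm["NumFalseNegatives"]

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"Precision": precision, "Recall": recall, "F1": f1}
```

      The input is the "ConfusionMatrix" dictionary found under
      "EvaluationMetrics" and "FindMatchesMetrics" for a transform.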

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"
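
   Because the response carries a "NextToken" when more results are
   available, a caller can loop until the token is absent. The
   generator below is a minimal sketch: the name "iter_ml_transforms"
   is hypothetical, "client" is assumed to be a
   "boto3.client('glue')" instance, and a missing or empty
   "NextToken" is assumed to mean that the last page was reached.

```python
def iter_ml_transforms(client, **kwargs):
    """Yield every transform across all pages of get_ml_transforms().

    Extra keyword arguments (Filter, Sort, MaxResults) are forwarded
    unchanged on every request.
    """
    params = dict(kwargs)
    while True:
        page = client.get_ml_transforms(**params)
        yield from page.get("Transforms", [])
        token = page.get("NextToken")
        if not token:
            break
        # Ask for the next page on the following iteration.
        params["NextToken"] = token
```

   For example, "[t['Name'] for t in iter_ml_transforms(client)]"
   would collect the names of all transforms.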


describe_integrations
*********************

Glue.Client.describe_integrations(**kwargs)

   Retrieves a list of integrations.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.describe_integrations(
          IntegrationIdentifier='string',
          Marker='string',
          MaxRecords=123,
          Filters=[
              {
                  'Name': 'string',
                  'Values': [
                      'string',
                  ]
              },
          ]
      )

   Parameters:
      * **IntegrationIdentifier** (*string*) -- The Amazon Resource
        Name (ARN) for the integration.

      * **Marker** (*string*) -- A value that indicates the starting
        point for the next set of response records in a subsequent
        request.

      * **MaxRecords** (*integer*) -- The total number of items to
        return in the output.

      * **Filters** (*list*) --

        A list of keys and values used to filter the results.
        Supported keys are "Status", "IntegrationName", and
        "SourceArn". "IntegrationName" is limited to a single value.

        * *(dict) --*

          A filter that can be used when invoking a
          "DescribeIntegrations" request.

          * **Name** *(string) --*

            The name of the filter.

          * **Values** *(list) --*

            A list of filter values.

            * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Integrations': [
                 {
                     'SourceArn': 'string',
                     'TargetArn': 'string',
                     'Description': 'string',
                     'IntegrationName': 'string',
                     'IntegrationArn': 'string',
                     'KmsKeyId': 'string',
                     'AdditionalEncryptionContext': {
                         'string': 'string'
                     },
                     'Tags': [
                         {
                             'key': 'string',
                             'value': 'string'
                         },
                     ],
                     'Status': 'CREATING'|'ACTIVE'|'MODIFYING'|'FAILED'|'DELETING'|'SYNCING'|'NEEDS_ATTENTION',
                     'CreateTime': datetime(2015, 1, 1),
                     'IntegrationConfig': {
                         'RefreshInterval': 'string',
                         'SourceProperties': {
                             'string': 'string'
                         }
                     },
                     'Errors': [
                         {
                             'ErrorCode': 'string',
                             'ErrorMessage': 'string'
                         },
                     ],
                     'DataFilter': 'string'
                 },
             ],
             'Marker': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Integrations** *(list) --*

          A list of zero-ETL integrations.

          * *(dict) --*

            Describes a zero-ETL integration.

            * **SourceArn** *(string) --*

              The ARN for the source of the integration.

            * **TargetArn** *(string) --*

              The ARN for the target of the integration.

            * **Description** *(string) --*

              A description for the integration.

            * **IntegrationName** *(string) --*

              A unique name for the integration.

            * **IntegrationArn** *(string) --*

              The Amazon Resource Name (ARN) for the integration.

            * **KmsKeyId** *(string) --*

              The ARN of a KMS key used for encrypting the channel.

            * **AdditionalEncryptionContext** *(dict) --*

              An optional set of non-secret key–value pairs that
              contains additional contextual information for
              encryption. This can only be provided if "KmsKeyId" is
              provided.

              * *(string) --*

                * *(string) --*

            * **Tags** *(list) --*

              Metadata assigned to the resource consisting of a list
              of key-value pairs.

              * *(dict) --*

                The "Tag" object represents a label that you can
                assign to an Amazon Web Services resource. Each tag
                consists of a key and an optional value, both of which
                you define.

                For more information about tags, and controlling
                access to resources in Glue, see Amazon Web Services
                Tags in Glue and Specifying Glue Resource ARNs in the
                developer guide.

                * **key** *(string) --*

                  The tag key. The key is required when you create a
                  tag on an object. The key is case-sensitive, and
                  must not contain the prefix aws.

                * **value** *(string) --*

                  The tag value. The value is optional when you create
                  a tag on an object. The value is case-sensitive, and
                  must not contain the prefix aws.

            * **Status** *(string) --*

              The possible statuses are:

              * CREATING: The integration is being created.

              * ACTIVE: The integration creation succeeds.

              * MODIFYING: The integration is being modified.

              * FAILED: The integration creation fails.

              * DELETING: The integration is being deleted.

              * SYNCING: The integration is synchronizing.

              * NEEDS_ATTENTION: The integration needs attention, such
                as synchronization.

            * **CreateTime** *(datetime) --*

              The time that the integration was created, in UTC.

            * **IntegrationConfig** *(dict) --*

              Properties associated with the integration.

              * **RefreshInterval** *(string) --*

                Specifies the frequency at which CDC (Change Data
                Capture) pulls or incremental loads should occur. This
                parameter provides flexibility to align the refresh
                rate with your specific data update patterns, system
                load considerations, and performance optimization
                goals. Time increment can be set from 15 minutes to
                8640 minutes (six days). Currently supports creation
                of "RefreshInterval" only.

              * **SourceProperties** *(dict) --*

                A collection of key-value pairs that specify
                additional properties for the integration source.
                These properties provide configuration options that
                can be used to customize the behavior of the ODB
                source during data integration operations.

                * *(string) --*

                  * *(string) --*

            * **Errors** *(list) --*

              A list of errors associated with the integration.

              * *(dict) --*

                An error associated with a zero-ETL integration.

                * **ErrorCode** *(string) --*

                  The code associated with this error.

                * **ErrorMessage** *(string) --*

                  A message describing the error.

            * **DataFilter** *(string) --*

              Selects source tables for the integration using Maxwell
              filter syntax.

        * **Marker** *(string) --*

          A value that indicates the starting point for the next set
          of response records in a subsequent request.

   **Exceptions**

   * "Glue.Client.exceptions.ValidationException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.InternalServerException"

   * "Glue.Client.exceptions.IntegrationNotFoundFault"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.InvalidInputException"
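
   The "Marker" value in the response above can drive a simple
   pagination loop. The following is a minimal sketch, not a
   definitive implementation: "iter_marker_pages" and "fetch_page" are
   hypothetical names, and it assumes the request accepts the marker
   under the key "Marker", which this section does not show.

```python
def iter_marker_pages(fetch_page, **request):
    """Yield each response page, following 'Marker' until it is absent.

    fetch_page is any callable that accepts keyword arguments and
    returns a response dict (for example, a bound client method).
    Assumption: the request parameter is also named 'Marker'.
    """
    marker = None
    while True:
        kwargs = dict(request)
        if marker:
            # Resume from where the previous page ended.
            kwargs["Marker"] = marker
        page = fetch_page(**kwargs)
        yield page
        marker = page.get("Marker")
        if not marker:
            # No marker means there are no further records.
            break
```

   Pass the bound client method for this operation as "fetch_page"
   and read the records from each yielded page.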


start_job_run
*************

Glue.Client.start_job_run(**kwargs)

   Starts a job run using a job definition.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.start_job_run(
          JobName='string',
          JobRunQueuingEnabled=True|False,
          JobRunId='string',
          Arguments={
              'string': 'string'
          },
          AllocatedCapacity=123,
          Timeout=123,
          MaxCapacity=123.0,
          SecurityConfiguration='string',
          NotificationProperty={
              'NotifyDelayAfter': 123
          },
          WorkerType='Standard'|'G.1X'|'G.2X'|'G.025X'|'G.4X'|'G.8X'|'Z.2X',
          NumberOfWorkers=123,
          ExecutionClass='FLEX'|'STANDARD',
          ExecutionRoleSessionPolicy='string'
      )

   Parameters:
      * **JobName** (*string*) --

        **[REQUIRED]**

        The name of the job definition to use.

      * **JobRunQueuingEnabled** (*boolean*) --

        Specifies whether job run queuing is enabled for the job run.

        A value of true means job run queuing is enabled for the job
        run. If false or not populated, the job run will not be
        considered for queuing.

      * **JobRunId** (*string*) -- The ID of a previous "JobRun" to
        retry.

      * **Arguments** (*dict*) --

        The job arguments associated with this run. For this job run,
        they replace the default arguments set in the job definition
        itself.

        You can specify arguments here that your own job-execution
        script consumes, as well as arguments that Glue itself
        consumes.

        Job arguments may be logged. Do not pass plaintext secrets as
        arguments. Retrieve secrets from a Glue Connection, Secrets
        Manager or other secret management mechanism if you intend to
        keep them within the Job.

        For information about how to specify and consume your own Job
        arguments, see the Calling Glue APIs in Python topic in the
        developer guide.

        For information about the arguments you can provide to this
        field when configuring Spark jobs, see the Special Parameters
        Used by Glue topic in the developer guide.

        For information about the arguments you can provide to this
        field when configuring Ray jobs, see Using job parameters in
        Ray jobs in the developer guide.

        * *(string) --*

          * *(string) --*

      * **AllocatedCapacity** (*integer*) --

        This field is deprecated. Use "MaxCapacity" instead.

        The number of Glue data processing units (DPUs) to allocate to
        this JobRun. You can allocate a minimum of 2 DPUs; the default
        is 10. A DPU is a relative measure of processing power that
        consists of 4 vCPUs of compute capacity and 16 GB of memory.
        For more information, see the Glue pricing page.

      * **Timeout** (*integer*) --

        The "JobRun" timeout in minutes. This is the maximum time that
        a job run can consume resources before it is terminated and
        enters "TIMEOUT" status. This value overrides the timeout
        value set in the parent job.

        Jobs must have timeout values less than 7 days or 10080
        minutes. Otherwise, the jobs will throw an exception.

        When the value is left blank, the timeout defaults to 2880
        minutes.

        Any existing Glue jobs that had a timeout value greater than 7
        days will be defaulted to 7 days. For instance, if you have
        specified a timeout of 20 days for a batch job, it will be
        stopped on the 7th day.

        For streaming jobs, if you have set up a maintenance window,
        it will be restarted during the maintenance window after 7
        days.

      * **MaxCapacity** (*float*) --

        For Glue version 1.0 or earlier jobs, using the standard
        worker type, the number of Glue data processing units (DPUs)
        that can be allocated when this job runs. A DPU is a relative
        measure of processing power that consists of 4 vCPUs of
        compute capacity and 16 GB of memory. For more information,
        see the Glue pricing page.

        For Glue version 2.0+ jobs, you cannot specify a "Maximum
        capacity". Instead, you should specify a "Worker type" and the
        "Number of workers".

        Do not set "MaxCapacity" if using "WorkerType" and
        "NumberOfWorkers".

        The value that can be allocated for "MaxCapacity" depends on
        whether you are running a Python shell job, an Apache Spark
        ETL job, or an Apache Spark streaming ETL job:

        * When you specify a Python shell job
          ("JobCommand.Name"="pythonshell"), you can allocate either
          0.0625 or 1 DPU. The default is 0.0625 DPU.

        * When you specify an Apache Spark ETL job
          ("JobCommand.Name"="glueetl") or Apache Spark streaming ETL
          job ("JobCommand.Name"="gluestreaming"), you can allocate
          from 2 to 100 DPUs. The default is 10 DPUs. This job type
          cannot have a fractional DPU allocation.

      * **SecurityConfiguration** (*string*) -- The name of the
        "SecurityConfiguration" structure to be used with this job
        run.

      * **NotificationProperty** (*dict*) --

        Specifies configuration properties of a job run notification.

        * **NotifyDelayAfter** *(integer) --*

          After a job run starts, the number of minutes to wait before
          sending a job run delay notification.

      * **WorkerType** (*string*) --

        The type of predefined worker that is allocated when a job
        runs. Accepts a value of G.1X, G.2X, G.4X, G.8X or G.025X for
        Spark jobs. Accepts the value Z.2X for Ray jobs.

        * For the "G.1X" worker type, each worker maps to 1 DPU (4
          vCPUs, 16 GB of memory) with 94GB disk, and provides 1
          executor per worker. We recommend this worker type for
          workloads such as data transforms, joins, and queries, as it
          offers a scalable and cost-effective way to run most jobs.

        * For the "G.2X" worker type, each worker maps to 2 DPU (8
          vCPUs, 32 GB of memory) with 138GB disk, and provides 1
          executor per worker. We recommend this worker type for
          workloads such as data transforms, joins, and queries, as it
          offers a scalable and cost-effective way to run most jobs.

        * For the "G.4X" worker type, each worker maps to 4 DPU (16
          vCPUs, 64 GB of memory) with 256GB disk, and provides 1
          executor per worker. We recommend this worker type for jobs
          whose workloads contain your most demanding transforms,
          aggregations, joins, and queries. This worker type is
          available only for Glue version 3.0 or later Spark ETL jobs
          in the following Amazon Web Services Regions: US East
          (Ohio), US East (N. Virginia), US West (Oregon), Asia
          Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific
          (Tokyo), Canada (Central), Europe (Frankfurt), Europe
          (Ireland), and Europe (Stockholm).

        * For the "G.8X" worker type, each worker maps to 8 DPU (32
          vCPUs, 128 GB of memory) with 512GB disk, and provides 1
          executor per worker. We recommend this worker type for jobs
          whose workloads contain your most demanding transforms,
          aggregations, joins, and queries. This worker type is
          available only for Glue version 3.0 or later Spark ETL jobs,
          in the same Amazon Web Services Regions as supported for the
          "G.4X" worker type.

        * For the "G.025X" worker type, each worker maps to 0.25 DPU
          (2 vCPUs, 4 GB of memory) with 84GB disk, and provides 1
          executor per worker. We recommend this worker type for low
          volume streaming jobs. This worker type is only available
          for Glue version 3.0 or later streaming jobs.

        * For the "Z.2X" worker type, each worker maps to 2 M-DPU (8
          vCPUs, 64 GB of memory) with 128 GB disk, and provides up to
          8 Ray workers based on the autoscaler.

      * **NumberOfWorkers** (*integer*) -- The number of workers of a
        defined "workerType" that are allocated when a job runs.

      * **ExecutionClass** (*string*) --

        Indicates whether the job is run with a standard or flexible
        execution class. The standard execution-class is ideal for
        time-sensitive workloads that require fast job startup and
        dedicated resources.

        The flexible execution class is appropriate for time-
        insensitive jobs whose start and completion times may vary.

        Only jobs with Glue version 3.0 and above and command type
        "glueetl" will be allowed to set "ExecutionClass" to "FLEX".
        The flexible execution class is available for Spark jobs.

      * **ExecutionRoleSessionPolicy** (*string*) -- An inline session
        policy for the StartJobRun API that lets you dynamically
        restrict the permissions of the specified execution role for
        the scope of the job, without requiring the creation of
        additional IAM roles.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'JobRunId': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **JobRunId** *(string) --*

          The ID assigned to this job run.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.ConcurrentRunsExceededException"
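
   The capacity and timeout rules described above can be checked
   before the request is sent. The following is a minimal sketch
   under stated assumptions: "build_start_job_run_kwargs" is a
   hypothetical helper, not part of boto3, and it treats 10080
   minutes as the largest accepted "Timeout", which is one reading of
   the limit described above.

```python
MAX_TIMEOUT_MINUTES = 10080  # 7 days; assumed to be the inclusive upper bound


def build_start_job_run_kwargs(job_name, *, arguments=None, timeout=None,
                               worker_type=None, number_of_workers=None,
                               max_capacity=None):
    """Assemble keyword arguments for client.start_job_run(**kwargs)."""
    uses_workers = worker_type is not None or number_of_workers is not None
    if max_capacity is not None and uses_workers:
        # The documentation says not to combine these settings.
        raise ValueError(
            "Do not set MaxCapacity together with WorkerType/NumberOfWorkers")
    if timeout is not None and timeout > MAX_TIMEOUT_MINUTES:
        raise ValueError("Timeout must not exceed 7 days (10080 minutes)")

    kwargs = {"JobName": job_name}
    optional = {
        "Arguments": arguments,
        "Timeout": timeout,
        "WorkerType": worker_type,
        "NumberOfWorkers": number_of_workers,
        "MaxCapacity": max_capacity,
    }
    # Only send the parameters the caller actually provided.
    kwargs.update({k: v for k, v in optional.items() if v is not None})
    return kwargs
```

   The result can be passed as "client.start_job_run(**kwargs)"; the
   returned dict then contains the "JobRunId" of the new run.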


create_custom_entity_type
*************************

Glue.Client.create_custom_entity_type(**kwargs)

   Creates a custom pattern that is used to detect sensitive data
   across the columns and rows of your structured data.

   Each custom pattern you create specifies a regular expression and
   an optional list of context words. If no context words are passed,
   only the regular expression is checked.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_custom_entity_type(
          Name='string',
          RegexString='string',
          ContextWords=[
              'string',
          ],
          Tags={
              'string': 'string'
          }
      )

   Parameters:
      * **Name** (*string*) --

        **[REQUIRED]**

        A name for the custom pattern that allows it to be retrieved
        or deleted later. This name must be unique per Amazon Web
        Services account.

      * **RegexString** (*string*) --

        **[REQUIRED]**

        A regular expression string that is used for detecting
        sensitive data in a custom pattern.

      * **ContextWords** (*list*) --

        A list of context words. If none of these context words are
        found in the vicinity of the regular expression match, the
        data will not be detected as sensitive data.

        If no context words are passed, only the regular expression is
        checked.

        * *(string) --*

      * **Tags** (*dict*) --

        A list of tags applied to the custom entity type.

        * *(string) --*

          * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Name': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Name** *(string) --*

          The name of the custom pattern you created.

   **Exceptions**

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.IdempotentParameterMismatchException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"
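
   A request for this operation can be assembled and sanity-checked
   locally before it is sent. The following is a minimal sketch:
   "build_custom_entity_type_request" is a hypothetical helper, and
   compiling the pattern with Python's "re" module is only a local
   check that assumes the pattern is also valid in the regular
   expression dialect Glue uses, which this section does not specify.

```python
import re


def build_custom_entity_type_request(name, regex_string,
                                     context_words=None, tags=None):
    """Assemble keyword arguments for client.create_custom_entity_type."""
    # Local sanity check only; raises re.error if the pattern is malformed
    # according to Python's dialect (an assumption, see the note above).
    re.compile(regex_string)

    request = {"Name": name, "RegexString": regex_string}
    if context_words:
        # Omitting ContextWords means only the regular expression is checked.
        request["ContextWords"] = list(context_words)
    if tags:
        request["Tags"] = dict(tags)
    return request
```

   The result can be passed as
   "client.create_custom_entity_type(**request)".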


stop_workflow_run
*****************

Glue.Client.stop_workflow_run(**kwargs)

   Stops the execution of the specified workflow run.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.stop_workflow_run(
          Name='string',
          RunId='string'
      )

   Parameters:
      * **Name** (*string*) --

        **[REQUIRED]**

        The name of the workflow to stop.

      * **RunId** (*string*) --

        **[REQUIRED]**

        The ID of the workflow run to stop.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.IllegalWorkflowStateException"
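
   Callers that only want a best-effort stop can catch the
   state-related exception listed above. The following is a minimal
   sketch: "stop_workflow_run_if_possible" is a hypothetical helper,
   and it assumes that "IllegalWorkflowStateException" signals a run
   that is not in a state that can be stopped.

```python
def stop_workflow_run_if_possible(client, name, run_id):
    """Request a stop; return False if the run's state does not allow it.

    client is expected to be a Glue client, for example the result of
    boto3.client('glue').
    """
    try:
        client.stop_workflow_run(Name=name, RunId=run_id)
    except client.exceptions.IllegalWorkflowStateException:
        # Assumption: the run is not in a state that can be stopped.
        return False
    return True
```

   Other exceptions, such as "EntityNotFoundException", are left to
   propagate so that a wrong workflow name or run ID is not silently
   ignored.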


update_crawler
**************

Glue.Client.update_crawler(**kwargs)

   Updates a crawler. If a crawler is running, you must stop it using
   "StopCrawler" before updating it.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.update_crawler(
          Name='string',
          Role='string',
          DatabaseName='string',
          Description='string',
          Targets={
              'S3Targets': [
                  {
                      'Path': 'string',
                      'Exclusions': [
                          'string',
                      ],
                      'ConnectionName': 'string',
                      'SampleSize': 123,
                      'EventQueueArn': 'string',
                      'DlqEventQueueArn': 'string'
                  },
              ],
              'JdbcTargets': [
                  {
                      'ConnectionName': 'string',
                      'Path': 'string',
                      'Exclusions': [
                          'string',
                      ],
                      'EnableAdditionalMetadata': [
                          'COMMENTS'|'RAWTYPES',
                      ]
                  },
              ],
              'MongoDBTargets': [
                  {
                      'ConnectionName': 'string',
                      'Path': 'string',
                      'ScanAll': True|False
                  },
              ],
              'DynamoDBTargets': [
                  {
                      'Path': 'string',
                      'scanAll': True|False,
                      'scanRate': 123.0
                  },
              ],
              'CatalogTargets': [
                  {
                      'DatabaseName': 'string',
                      'Tables': [
                          'string',
                      ],
                      'ConnectionName': 'string',
                      'EventQueueArn': 'string',
                      'DlqEventQueueArn': 'string'
                  },
              ],
              'DeltaTargets': [
                  {
                      'DeltaTables': [
                          'string',
                      ],
                      'ConnectionName': 'string',
                      'WriteManifest': True|False,
                      'CreateNativeDeltaTable': True|False
                  },
              ],
              'IcebergTargets': [
                  {
                      'Paths': [
                          'string',
                      ],
                      'ConnectionName': 'string',
                      'Exclusions': [
                          'string',
                      ],
                      'MaximumTraversalDepth': 123
                  },
              ],
              'HudiTargets': [
                  {
                      'Paths': [
                          'string',
                      ],
                      'ConnectionName': 'string',
                      'Exclusions': [
                          'string',
                      ],
                      'MaximumTraversalDepth': 123
                  },
              ]
          },
          Schedule='string',
          Classifiers=[
              'string',
          ],
          TablePrefix='string',
          SchemaChangePolicy={
              'UpdateBehavior': 'LOG'|'UPDATE_IN_DATABASE',
              'DeleteBehavior': 'LOG'|'DELETE_FROM_DATABASE'|'DEPRECATE_IN_DATABASE'
          },
          RecrawlPolicy={
              'RecrawlBehavior': 'CRAWL_EVERYTHING'|'CRAWL_NEW_FOLDERS_ONLY'|'CRAWL_EVENT_MODE'
          },
          LineageConfiguration={
              'CrawlerLineageSettings': 'ENABLE'|'DISABLE'
          },
          LakeFormationConfiguration={
              'UseLakeFormationCredentials': True|False,
              'AccountId': 'string'
          },
          Configuration='string',
          CrawlerSecurityConfiguration='string'
      )

   Parameters:
      * **Name** (*string*) --

        **[REQUIRED]**

        Name of the crawler to update.

      * **Role** (*string*) -- The IAM role or Amazon Resource Name
        (ARN) of an IAM role that is used by the new crawler to access
        customer resources.

      * **DatabaseName** (*string*) -- The Glue database where results
        are stored, such as:
        "arn:aws:daylight:us-east-1::database/sometable/*".

      * **Description** (*string*) -- A description of the new
        crawler.

      * **Targets** (*dict*) --

        A list of targets to crawl.

        * **S3Targets** *(list) --*

          Specifies Amazon Simple Storage Service (Amazon S3) targets.

          * *(dict) --*

            Specifies a data store in Amazon Simple Storage Service
            (Amazon S3).

            * **Path** *(string) --*

              The path to the Amazon S3 target.

            * **Exclusions** *(list) --*

              A list of glob patterns used to exclude from the crawl.
              For more information, see Catalog Tables with a Crawler.

              * *(string) --*

            * **ConnectionName** *(string) --*

              The name of a connection which allows a job or crawler
              to access data in Amazon S3 within an Amazon Virtual
              Private Cloud environment (Amazon VPC).

            * **SampleSize** *(integer) --*

              Sets the number of files in each leaf folder to be
              crawled when crawling sample files in a dataset. If not
              set, all the files are crawled. A valid value is an
              integer between 1 and 249.

            * **EventQueueArn** *(string) --*

              A valid Amazon SQS ARN. For example,
              "arn:aws:sqs:region:account:sqs".

            * **DlqEventQueueArn** *(string) --*

              A valid Amazon dead-letter SQS ARN. For example,
              "arn:aws:sqs:region:account:deadLetterQueue".

        * **JdbcTargets** *(list) --*

          Specifies JDBC targets.

          * *(dict) --*

            Specifies a JDBC data store to crawl.

            * **ConnectionName** *(string) --*

              The name of the connection to use to connect to the JDBC
              target.

            * **Path** *(string) --*

              The path of the JDBC target.

            * **Exclusions** *(list) --*

              A list of glob patterns used to exclude from the crawl.
              For more information, see Catalog Tables with a Crawler.

              * *(string) --*

            * **EnableAdditionalMetadata** *(list) --*

              Specify a value of "RAWTYPES" or "COMMENTS" to enable
              additional metadata in table responses. "RAWTYPES"
              provides the native-level datatype. "COMMENTS" provides
              comments associated with a column or table in the
              database.

              If you do not need additional metadata, keep the field
              empty.

              * *(string) --*

        * **MongoDBTargets** *(list) --*

          Specifies Amazon DocumentDB or MongoDB targets.

          * *(dict) --*

            Specifies an Amazon DocumentDB or MongoDB data store to
            crawl.

            * **ConnectionName** *(string) --*

              The name of the connection to use to connect to the
              Amazon DocumentDB or MongoDB target.

            * **Path** *(string) --*

              The path of the Amazon DocumentDB or MongoDB target
              (database/collection).

            * **ScanAll** *(boolean) --*

              Indicates whether to scan all the records, or to sample
              rows from the table. Scanning all the records can take a
              long time when the table is not a high throughput table.

              A value of "true" means to scan all records, while a
              value of "false" means to sample the records. If no
              value is specified, the value defaults to "true".

        * **DynamoDBTargets** *(list) --*

          Specifies Amazon DynamoDB targets.

          * *(dict) --*

            Specifies an Amazon DynamoDB table to crawl.

            * **Path** *(string) --*

              The name of the DynamoDB table to crawl.

            * **scanAll** *(boolean) --*

              Indicates whether to scan all the records, or to sample
              rows from the table. Scanning all the records can take a
              long time when the table is not a high throughput table.

              A value of "true" means to scan all records, while a
              value of "false" means to sample the records. If no
              value is specified, the value defaults to "true".

            * **scanRate** *(float) --*

              The percentage of the configured read capacity units for
              the Glue crawler to use. Read capacity units is a term
              defined by DynamoDB, and is a numeric value that acts as
              a rate limiter for the number of reads that can be
              performed on that table per second.

              The valid values are null or a value between 0.1 and
              1.5. A null value is used when the user does not provide
              a value, and defaults to 0.5 of the configured Read
              Capacity Unit (for provisioned tables), or 0.25 of the
              max configured Read Capacity Unit (for tables using
              on-demand mode).

        * **CatalogTargets** *(list) --*

          Specifies Glue Data Catalog targets.

          * *(dict) --*

            Specifies a Glue Data Catalog target.

            * **DatabaseName** *(string) --* **[REQUIRED]**

              The name of the database to be synchronized.

            * **Tables** *(list) --* **[REQUIRED]**

              A list of the tables to be synchronized.

              * *(string) --*

            * **ConnectionName** *(string) --*

              The name of the connection for an Amazon S3-backed Data
              Catalog table to be a target of the crawl when using a
              "Catalog" connection type paired with a "NETWORK"
              Connection type.

            * **EventQueueArn** *(string) --*

              A valid Amazon SQS ARN. For example,
              "arn:aws:sqs:region:account:sqs".

            * **DlqEventQueueArn** *(string) --*

              A valid Amazon dead-letter SQS ARN. For example,
              "arn:aws:sqs:region:account:deadLetterQueue".

        * **DeltaTargets** *(list) --*

          Specifies Delta data store targets.

          * *(dict) --*

            Specifies a Delta data store to crawl one or more Delta
            tables.

            * **DeltaTables** *(list) --*

              A list of the Amazon S3 paths to the Delta tables.

              * *(string) --*

            * **ConnectionName** *(string) --*

              The name of the connection to use to connect to the
              Delta table target.

            * **WriteManifest** *(boolean) --*

              Specifies whether to write the manifest files to the
              Delta table path.

            * **CreateNativeDeltaTable** *(boolean) --*

              Specifies whether the crawler will create native tables,
              to allow integration with query engines that support
              querying of the Delta transaction log directly.

        * **IcebergTargets** *(list) --*

          Specifies Apache Iceberg data store targets.

          * *(dict) --*

            Specifies an Apache Iceberg data source where Iceberg
            tables are stored in Amazon S3.

            * **Paths** *(list) --*

              One or more Amazon S3 paths that contain Iceberg
              metadata folders as "s3://bucket/prefix".

              * *(string) --*

            * **ConnectionName** *(string) --*

              The name of the connection to use to connect to the
              Iceberg target.

            * **Exclusions** *(list) --*

              A list of glob patterns used to exclude from the crawl.
              For more information, see Catalog Tables with a Crawler.

              * *(string) --*

            * **MaximumTraversalDepth** *(integer) --*

              The maximum depth of Amazon S3 paths that the crawler
              can traverse to discover the Iceberg metadata folder in
              your Amazon S3 path. Used to limit the crawler run time.

        * **HudiTargets** *(list) --*

          Specifies Apache Hudi data store targets.

          * *(dict) --*

            Specifies an Apache Hudi data source.

            * **Paths** *(list) --*

              An array of Amazon S3 location strings for Hudi, each
              indicating the root folder in which the metadata files
              for a Hudi table reside. The Hudi folder may be located
              in a child folder of the root folder.

              The crawler will scan all folders underneath a path for
              a Hudi folder.

              * *(string) --*

            * **ConnectionName** *(string) --*

              The name of the connection to use to connect to the Hudi
              target. If your Hudi files are stored in buckets that
              require VPC authorization, you can set their connection
              properties here.

            * **Exclusions** *(list) --*

              A list of glob patterns used to exclude from the crawl.
              For more information, see Catalog Tables with a Crawler.

              * *(string) --*

            * **MaximumTraversalDepth** *(integer) --*

              The maximum depth of Amazon S3 paths that the crawler
              can traverse to discover the Hudi metadata folder in
              your Amazon S3 path. Used to limit the crawler run time.

      * **Schedule** (*string*) -- A "cron" expression used to specify
        the schedule (see Time-Based Schedules for Jobs and Crawlers).
        For example, to run something every day at 12:15 UTC, you
        would specify: "cron(15 12 * * ? *)".

      * **Classifiers** (*list*) --

        A list of custom classifiers that the user has registered. By
        default, all built-in classifiers are included in a crawl, but
        these custom classifiers always override the default
        classifiers for a given classification.

        * *(string) --*

      * **TablePrefix** (*string*) -- The table prefix used for
        catalog tables that are created.

      * **SchemaChangePolicy** (*dict*) --

        The policy for the crawler's update and deletion behavior.

        * **UpdateBehavior** *(string) --*

          The update behavior when the crawler finds a changed schema.

        * **DeleteBehavior** *(string) --*

          The deletion behavior when the crawler finds a deleted
          object.

      * **RecrawlPolicy** (*dict*) --

        A policy that specifies whether to crawl the entire dataset
        again, or to crawl only folders that were added since the last
        crawler run.

        * **RecrawlBehavior** *(string) --*

          Specifies whether to crawl the entire dataset again or to
          crawl only folders that were added since the last crawler
          run.

          A value of "CRAWL_EVERYTHING" specifies crawling the entire
          dataset again.

          A value of "CRAWL_NEW_FOLDERS_ONLY" specifies crawling only
          folders that were added since the last crawler run.

          A value of "CRAWL_EVENT_MODE" specifies crawling only the
          changes identified by Amazon S3 events.

      * **LineageConfiguration** (*dict*) --

        Specifies data lineage configuration settings for the crawler.

        * **CrawlerLineageSettings** *(string) --*

          Specifies whether data lineage is enabled for the crawler.
          Valid values are:

          * ENABLE: enables data lineage for the crawler

          * DISABLE: disables data lineage for the crawler

      * **LakeFormationConfiguration** (*dict*) --

        Specifies Lake Formation configuration settings for the
        crawler.

        * **UseLakeFormationCredentials** *(boolean) --*

          Specifies whether to use Lake Formation credentials for the
          crawler instead of the IAM role credentials.

        * **AccountId** *(string) --*

          Required for cross-account crawls. For crawls in the same
          account as the target data, this can be left as null.

      * **Configuration** (*string*) -- Crawler configuration
        information. This versioned JSON string allows users to
        specify aspects of a crawler's behavior. For more information,
        see Setting crawler configuration options.

      * **CrawlerSecurityConfiguration** (*string*) -- The name of the
        "SecurityConfiguration" structure to be used by this crawler.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.VersionMismatchException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.CrawlerRunningException"

   * "Glue.Client.exceptions.OperationTimeoutException"
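
   The nested "Targets" structure and the value ranges described
   above can be assembled with small helpers. The following is a
   minimal sketch covering only Amazon S3 and DynamoDB targets; the
   helper names are hypothetical and not part of boto3, and the range
   checks simply mirror the limits stated for "SampleSize" and
   "scanRate".

```python
def s3_target(path, *, exclusions=None, sample_size=None):
    """Build one 'S3Targets' entry."""
    if sample_size is not None and not 1 <= sample_size <= 249:
        raise ValueError("SampleSize must be an integer between 1 and 249")
    target = {"Path": path}
    if exclusions:
        target["Exclusions"] = list(exclusions)
    if sample_size is not None:
        target["SampleSize"] = sample_size
    return target


def dynamodb_target(table_name, *, scan_all=None, scan_rate=None):
    """Build one 'DynamoDBTargets' entry (note the lowercase key names)."""
    if scan_rate is not None and not 0.1 <= scan_rate <= 1.5:
        raise ValueError("scanRate must be between 0.1 and 1.5")
    target = {"Path": table_name}
    if scan_all is not None:
        target["scanAll"] = scan_all
    if scan_rate is not None:
        target["scanRate"] = scan_rate
    return target


def build_update_crawler_kwargs(name, *, s3_targets=(), dynamodb_targets=(),
                                schedule=None):
    """Assemble keyword arguments for client.update_crawler(**kwargs)."""
    kwargs = {"Name": name}
    targets = {}
    if s3_targets:
        targets["S3Targets"] = list(s3_targets)
    if dynamodb_targets:
        targets["DynamoDBTargets"] = list(dynamodb_targets)
    if targets:
        kwargs["Targets"] = targets
    if schedule is not None:
        kwargs["Schedule"] = schedule
    return kwargs
```

   The result can be passed as "client.update_crawler(**kwargs)". As
   noted above, a running crawler must be stopped before it is
   updated.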


get_workflow
************

Glue.Client.get_workflow(**kwargs)

   Retrieves resource metadata for a workflow.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_workflow(
          Name='string',
          IncludeGraph=True|False
      )

   Parameters:
      * **Name** (*string*) --

        **[REQUIRED]**

        The name of the workflow to retrieve.

      * **IncludeGraph** (*boolean*) -- Specifies whether to include a
        graph when returning the workflow resource metadata.
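
As a rough sketch of how the two parameters fit together, the wrapper below forwards them to "get_workflow" and returns the "Workflow" entry of the response. The function name "fetch_workflow" is hypothetical, and the client is passed in so the sketch does not assume any particular credentials or region setup.

```python
def fetch_workflow(client, name, include_graph=False):
    """Return the 'Workflow' dict for `name`, or None if the key is absent.

    `client` is expected to be a Glue client, e.g. boto3.client('glue').
    """
    response = client.get_workflow(
        Name=name,                   # required
        IncludeGraph=include_graph,  # optional; request the node/edge graph
    )
    return response.get("Workflow")
```

With a real client this might be called as "fetch_workflow(boto3.client('glue'), 'my-workflow', include_graph=True)", where the workflow name is a placeholder.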

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Workflow': {
                 'Name': 'string',
                 'Description': 'string',
                 'DefaultRunProperties': {
                     'string': 'string'
                 },
                 'CreatedOn': datetime(2015, 1, 1),
                 'LastModifiedOn': datetime(2015, 1, 1),
                 'LastRun': {
                     'Name': 'string',
                     'WorkflowRunId': 'string',
                     'PreviousRunId': 'string',
                     'WorkflowRunProperties': {
                         'string': 'string'
                     },
                     'StartedOn': datetime(2015, 1, 1),
                     'CompletedOn': datetime(2015, 1, 1),
                     'Status': 'RUNNING'|'COMPLETED'|'STOPPING'|'STOPPED'|'ERROR',
                     'ErrorMessage': 'string',
                     'Statistics': {
                         'TotalActions': 123,
                         'TimeoutActions': 123,
                         'FailedActions': 123,
                         'StoppedActions': 123,
                         'SucceededActions': 123,
                         'RunningActions': 123,
                         'ErroredActions': 123,
                         'WaitingActions': 123
                     },
                     'Graph': {
                         'Nodes': [
                             {
                                 'Type': 'CRAWLER'|'JOB'|'TRIGGER',
                                 'Name': 'string',
                                 'UniqueId': 'string',
                                 'TriggerDetails': {
                                     'Trigger': {
                                         'Name': 'string',
                                         'WorkflowName': 'string',
                                         'Id': 'string',
                                         'Type': 'SCHEDULED'|'CONDITIONAL'|'ON_DEMAND'|'EVENT',
                                         'State': 'CREATING'|'CREATED'|'ACTIVATING'|'ACTIVATED'|'DEACTIVATING'|'DEACTIVATED'|'DELETING'|'UPDATING',
                                         'Description': 'string',
                                         'Schedule': 'string',
                                         'Actions': [
                                             {
                                                 'JobName': 'string',
                                                 'Arguments': {
                                                     'string': 'string'
                                                 },
                                                 'Timeout': 123,
                                                 'SecurityConfiguration': 'string',
                                                 'NotificationProperty': {
                                                     'NotifyDelayAfter': 123
                                                 },
                                                 'CrawlerName': 'string'
                                             },
                                         ],
                                         'Predicate': {
                                             'Logical': 'AND'|'ANY',
                                             'Conditions': [
                                                 {
                                                     'LogicalOperator': 'EQUALS',
                                                     'JobName': 'string',
                                                     'State': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT'|'ERROR'|'WAITING'|'EXPIRED',
                                                     'CrawlerName': 'string',
                                                     'CrawlState': 'RUNNING'|'CANCELLING'|'CANCELLED'|'SUCCEEDED'|'FAILED'|'ERROR'
                                                 },
                                             ]
                                         },
                                         'EventBatchingCondition': {
                                             'BatchSize': 123,
                                             'BatchWindow': 123
                                         }
                                     }
                                 },
                                 'JobDetails': {
                                     'JobRuns': [
                                         {
                                             'Id': 'string',
                                             'Attempt': 123,
                                             'PreviousRunId': 'string',
                                             'TriggerName': 'string',
                                             'JobName': 'string',
                                             'JobMode': 'SCRIPT'|'VISUAL'|'NOTEBOOK',
                                             'JobRunQueuingEnabled': True|False,
                                             'StartedOn': datetime(2015, 1, 1),
                                             'LastModifiedOn': datetime(2015, 1, 1),
                                             'CompletedOn': datetime(2015, 1, 1),
                                             'JobRunState': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT'|'ERROR'|'WAITING'|'EXPIRED',
                                             'Arguments': {
                                                 'string': 'string'
                                             },
                                             'ErrorMessage': 'string',
                                             'PredecessorRuns': [
                                                 {
                                                     'JobName': 'string',
                                                     'RunId': 'string'
                                                 },
                                             ],
                                             'AllocatedCapacity': 123,
                                             'ExecutionTime': 123,
                                             'Timeout': 123,
                                             'MaxCapacity': 123.0,
                                             'WorkerType': 'Standard'|'G.1X'|'G.2X'|'G.025X'|'G.4X'|'G.8X'|'Z.2X',
                                             'NumberOfWorkers': 123,
                                             'SecurityConfiguration': 'string',
                                             'LogGroupName': 'string',
                                             'NotificationProperty': {
                                                 'NotifyDelayAfter': 123
                                             },
                                             'GlueVersion': 'string',
                                             'DPUSeconds': 123.0,
                                             'ExecutionClass': 'FLEX'|'STANDARD',
                                             'MaintenanceWindow': 'string',
                                             'ProfileName': 'string',
                                             'StateDetail': 'string',
                                             'ExecutionRoleSessionPolicy': 'string'
                                         },
                                     ]
                                 },
                                 'CrawlerDetails': {
                                     'Crawls': [
                                         {
                                             'State': 'RUNNING'|'CANCELLING'|'CANCELLED'|'SUCCEEDED'|'FAILED'|'ERROR',
                                             'StartedOn': datetime(2015, 1, 1),
                                             'CompletedOn': datetime(2015, 1, 1),
                                             'ErrorMessage': 'string',
                                             'LogGroup': 'string',
                                             'LogStream': 'string'
                                         },
                                     ]
                                 }
                             },
                         ],
                         'Edges': [
                             {
                                 'SourceId': 'string',
                                 'DestinationId': 'string'
                             },
                         ]
                     },
                     'StartingEventBatchCondition': {
                         'BatchSize': 123,
                         'BatchWindow': 123
                     }
                 },
                 'Graph': {
                     'Nodes': [
                         {
                             'Type': 'CRAWLER'|'JOB'|'TRIGGER',
                             'Name': 'string',
                             'UniqueId': 'string',
                             'TriggerDetails': {
                                 'Trigger': {
                                     'Name': 'string',
                                     'WorkflowName': 'string',
                                     'Id': 'string',
                                     'Type': 'SCHEDULED'|'CONDITIONAL'|'ON_DEMAND'|'EVENT',
                                     'State': 'CREATING'|'CREATED'|'ACTIVATING'|'ACTIVATED'|'DEACTIVATING'|'DEACTIVATED'|'DELETING'|'UPDATING',
                                     'Description': 'string',
                                     'Schedule': 'string',
                                     'Actions': [
                                         {
                                             'JobName': 'string',
                                             'Arguments': {
                                                 'string': 'string'
                                             },
                                             'Timeout': 123,
                                             'SecurityConfiguration': 'string',
                                             'NotificationProperty': {
                                                 'NotifyDelayAfter': 123
                                             },
                                             'CrawlerName': 'string'
                                         },
                                     ],
                                     'Predicate': {
                                         'Logical': 'AND'|'ANY',
                                         'Conditions': [
                                             {
                                                 'LogicalOperator': 'EQUALS',
                                                 'JobName': 'string',
                                                 'State': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT'|'ERROR'|'WAITING'|'EXPIRED',
                                                 'CrawlerName': 'string',
                                                 'CrawlState': 'RUNNING'|'CANCELLING'|'CANCELLED'|'SUCCEEDED'|'FAILED'|'ERROR'
                                             },
                                         ]
                                     },
                                     'EventBatchingCondition': {
                                         'BatchSize': 123,
                                         'BatchWindow': 123
                                     }
                                 }
                             },
                             'JobDetails': {
                                 'JobRuns': [
                                     {
                                         'Id': 'string',
                                         'Attempt': 123,
                                         'PreviousRunId': 'string',
                                         'TriggerName': 'string',
                                         'JobName': 'string',
                                         'JobMode': 'SCRIPT'|'VISUAL'|'NOTEBOOK',
                                         'JobRunQueuingEnabled': True|False,
                                         'StartedOn': datetime(2015, 1, 1),
                                         'LastModifiedOn': datetime(2015, 1, 1),
                                         'CompletedOn': datetime(2015, 1, 1),
                                         'JobRunState': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT'|'ERROR'|'WAITING'|'EXPIRED',
                                         'Arguments': {
                                             'string': 'string'
                                         },
                                         'ErrorMessage': 'string',
                                         'PredecessorRuns': [
                                             {
                                                 'JobName': 'string',
                                                 'RunId': 'string'
                                             },
                                         ],
                                         'AllocatedCapacity': 123,
                                         'ExecutionTime': 123,
                                         'Timeout': 123,
                                         'MaxCapacity': 123.0,
                                         'WorkerType': 'Standard'|'G.1X'|'G.2X'|'G.025X'|'G.4X'|'G.8X'|'Z.2X',
                                         'NumberOfWorkers': 123,
                                         'SecurityConfiguration': 'string',
                                         'LogGroupName': 'string',
                                         'NotificationProperty': {
                                             'NotifyDelayAfter': 123
                                         },
                                         'GlueVersion': 'string',
                                         'DPUSeconds': 123.0,
                                         'ExecutionClass': 'FLEX'|'STANDARD',
                                         'MaintenanceWindow': 'string',
                                         'ProfileName': 'string',
                                         'StateDetail': 'string',
                                         'ExecutionRoleSessionPolicy': 'string'
                                     },
                                 ]
                             },
                             'CrawlerDetails': {
                                 'Crawls': [
                                     {
                                         'State': 'RUNNING'|'CANCELLING'|'CANCELLED'|'SUCCEEDED'|'FAILED'|'ERROR',
                                         'StartedOn': datetime(2015, 1, 1),
                                         'CompletedOn': datetime(2015, 1, 1),
                                         'ErrorMessage': 'string',
                                         'LogGroup': 'string',
                                         'LogStream': 'string'
                                     },
                                 ]
                             }
                         },
                     ],
                     'Edges': [
                         {
                             'SourceId': 'string',
                             'DestinationId': 'string'
                         },
                     ]
                 },
                 'MaxConcurrentRuns': 123,
                 'BlueprintDetails': {
                     'BlueprintName': 'string',
                     'RunId': 'string'
                 }
             }
         }
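
The nested "Graph" structure in the syntax above can be flattened with a few lines of plain Python. This is a minimal sketch that works on a response-shaped dict. The helper names "summarize_graph" and "job_run_states" are hypothetical, and the code assumes "Graph" may be missing, for instance when the graph was not requested.

```python
from collections import defaultdict


def summarize_graph(workflow):
    """Map each node name to the names of the nodes its edges point to."""
    graph = workflow.get("Graph") or {}
    # Edges refer to nodes by UniqueId, so build an id -> name lookup first.
    names = {node["UniqueId"]: node["Name"] for node in graph.get("Nodes", [])}
    successors = defaultdict(list)
    for edge in graph.get("Edges", []):
        # Fall back to the raw id if an edge names an unknown node.
        source = names.get(edge["SourceId"], edge["SourceId"])
        destination = names.get(edge["DestinationId"], edge["DestinationId"])
        successors[source].append(destination)
    return dict(successors)


def job_run_states(graph):
    """Collect (job name, JobRunState) pairs for every run on JOB nodes."""
    pairs = []
    for node in graph.get("Nodes", []):
        if node.get("Type") != "JOB":
            continue
        for run in node.get("JobDetails", {}).get("JobRuns", []):
            pairs.append((node["Name"], run.get("JobRunState")))
    return pairs
```

The same helpers apply to either graph in the response: pass "response['Workflow']" to inspect the workflow definition, or "response['Workflow']['LastRun']" to inspect the graph of the last run.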

      **Response Structure**

      * *(dict) --*

        * **Workflow** *(dict) --*

          The resource metadata for the workflow.

          * **Name** *(string) --*

            The name of the workflow.

          * **Description** *(string) --*

            A description of the workflow.

          * **DefaultRunProperties** *(dict) --*

            A collection of properties to be used as part of each
            execution of the workflow. The run properties are made
            available to each job in the workflow. A job can modify
            the properties for the next jobs in the flow.

            * *(string) --*

              * *(string) --*

          * **CreatedOn** *(datetime) --*

            The date and time when the workflow was created.

          * **LastModifiedOn** *(datetime) --*

            The date and time when the workflow was last modified.

          * **LastRun** *(dict) --*

            The information about the last execution of the workflow.

            * **Name** *(string) --*

              Name of the workflow that was run.

            * **WorkflowRunId** *(string) --*

              The ID of this workflow run.

            * **PreviousRunId** *(string) --*

              The ID of the previous workflow run.

            * **WorkflowRunProperties** *(dict) --*

              The workflow run properties which were set during the
              run.

              * *(string) --*

                * *(string) --*

            * **StartedOn** *(datetime) --*

              The date and time when the workflow run was started.

            * **CompletedOn** *(datetime) --*

              The date and time when the workflow run completed.

            * **Status** *(string) --*

              The status of the workflow run.

            * **ErrorMessage** *(string) --*

              This error message describes any error that may have
              occurred in starting the workflow run. Currently the
              only error message is "Concurrent runs exceeded for
              workflow: "foo"."

            * **Statistics** *(dict) --*

              The statistics of the run.

              * **TotalActions** *(integer) --*

                Total number of Actions in the workflow run.

              * **TimeoutActions** *(integer) --*

                Total number of Actions that timed out.

              * **FailedActions** *(integer) --*

                Total number of Actions that have failed.

              * **StoppedActions** *(integer) --*

                Total number of Actions that have stopped.

              * **SucceededActions** *(integer) --*

                Total number of Actions that have succeeded.

              * **RunningActions** *(integer) --*

                Total number of Actions in the running state.

              * **ErroredActions** *(integer) --*

                Indicates the count of job runs in the ERROR state in
                the workflow run.

              * **WaitingActions** *(integer) --*

                Indicates the count of job runs in WAITING state in
                the workflow run.

            * **Graph** *(dict) --*

              The graph representing all the Glue components that
              belong to the workflow as nodes and directed connections
              between them as edges.

              * **Nodes** *(list) --*

                A list of the Glue components that belong to the
                workflow, represented as nodes.

                * *(dict) --*

                  A node represents a Glue component (trigger,
                  crawler, or job) on a workflow graph.

                  * **Type** *(string) --*

                    The type of Glue component represented by the
                    node.

                  * **Name** *(string) --*

                    The name of the Glue component represented by the
                    node.

                  * **UniqueId** *(string) --*

                    The unique Id assigned to the node within the
                    workflow.

                  * **TriggerDetails** *(dict) --*

                    Details of the Trigger when the node represents a
                    Trigger.

                    * **Trigger** *(dict) --*

                      The information of the trigger represented by
                      the trigger node.

                      * **Name** *(string) --*

                        The name of the trigger.

                      * **WorkflowName** *(string) --*

                        The name of the workflow associated with the
                        trigger.

                      * **Id** *(string) --*

                        Reserved for future use.

                      * **Type** *(string) --*

                        The type of trigger that this is.

                      * **State** *(string) --*

                        The current state of the trigger.

                      * **Description** *(string) --*

                        A description of this trigger.

                      * **Schedule** *(string) --*

                        A "cron" expression used to specify the
                        schedule (see Time-Based Schedules for Jobs
                        and Crawlers). For example, to run something
                        every day at 12:15 UTC, you would specify:
                        "cron(15 12 * * ? *)".

                      * **Actions** *(list) --*

                        The actions initiated by this trigger.

                        * *(dict) --*

                          Defines an action to be initiated by a
                          trigger.

                          * **JobName** *(string) --*

                            The name of a job to be run.

                          * **Arguments** *(dict) --*

                            The job arguments used when this trigger
                            fires. For this job run, they replace the
                            default arguments set in the job
                            definition itself.

                            You can specify arguments here that your
                            own job-execution script consumes, as well
                            as arguments that Glue itself consumes.

                            For information about how to specify and
                            consume your own Job arguments, see the
                            Calling Glue APIs in Python topic in the
                            developer guide.

                            For information about the key-value pairs
                            that Glue consumes to set up your job, see
                            the Special Parameters Used by Glue topic
                            in the developer guide.

                            * *(string) --*

                              * *(string) --*

                          * **Timeout** *(integer) --*

                            The "JobRun" timeout in minutes. This is
                            the maximum time that a job run can
                            consume resources before it is terminated
                            and enters "TIMEOUT" status. This
                            overrides the timeout value set in the
                            parent job.

                            Jobs must have timeout values less than 7
                            days or 10080 minutes. Otherwise, the jobs
                            will throw an exception.

                            When the value is left blank, the timeout
                            is defaulted to 2880 minutes.

                            Any existing Glue jobs that had a timeout
                            value greater than 7 days will be
                            defaulted to 7 days. For instance, if you
                            have specified a timeout of 20 days for a
                            batch job, it will be stopped on the 7th
                            day.

                            For streaming jobs, if you have set up a
                            maintenance window, it will be restarted
                            during the maintenance window after 7
                            days.

                          * **SecurityConfiguration** *(string) --*

                            The name of the "SecurityConfiguration"
                            structure to be used with this action.

                          * **NotificationProperty** *(dict) --*

                            Specifies configuration properties of a
                            job run notification.

                            * **NotifyDelayAfter** *(integer) --*

                              After a job run starts, the number of
                              minutes to wait before sending a job run
                              delay notification.

                          * **CrawlerName** *(string) --*

                            The name of the crawler to be used with
                            this action.

                      * **Predicate** *(dict) --*

                        The predicate of this trigger, which defines
                        when it will fire.

                        * **Logical** *(string) --*

                          An optional field if only one condition is
                          listed. If multiple conditions are listed,
                          then this field is required.

                        * **Conditions** *(list) --*

                          A list of the conditions that determine when
                          the trigger will fire.

                          * *(dict) --*

                            Defines a condition under which a trigger
                            fires.

                            * **LogicalOperator** *(string) --*

                              A logical operator.

                            * **JobName** *(string) --*

                              The name of the job whose "JobRuns" this
                              condition applies to, and on which this
                              trigger waits.

                            * **State** *(string) --*

                              The condition state. Currently, the only
                              job states that a trigger can listen for
                              are "SUCCEEDED", "STOPPED", "FAILED",
                              and "TIMEOUT". The only crawler states
                              that a trigger can listen for are
                              "SUCCEEDED", "FAILED", and "CANCELLED".

                            * **CrawlerName** *(string) --*

                              The name of the crawler to which this
                              condition applies.

                            * **CrawlState** *(string) --*

                              The state of the crawler to which this
                              condition applies.

                      * **EventBatchingCondition** *(dict) --*

                        Batch condition that must be met (specified
                        number of events received or batch time window
                        expired) before the EventBridge event trigger
                        fires.

                        * **BatchSize** *(integer) --*

                          Number of events that must be received from
                          Amazon EventBridge before the EventBridge
                          event trigger fires.

                        * **BatchWindow** *(integer) --*

                          Window of time in seconds after which the
                          EventBridge event trigger fires. The window
                          starts when the first event is received.

                  * **JobDetails** *(dict) --*

                    Details of the Job when the node represents a Job.

                    * **JobRuns** *(list) --*

                      The information for the job runs represented by
                      the job node.

                      * *(dict) --*

                        Contains information about a job run.

                        * **Id** *(string) --*

                          The ID of this job run.

                        * **Attempt** *(integer) --*

                          The number of the attempt to run this job.

                        * **PreviousRunId** *(string) --*

                          The ID of the previous run of this job. For
                          example, the "JobRunId" specified in the
                          "StartJobRun" action.

                        * **TriggerName** *(string) --*

                          The name of the trigger that started this
                          job run.

                        * **JobName** *(string) --*

                          The name of the job definition being used in
                          this run.

                        * **JobMode** *(string) --*

                          A mode that describes how a job was created.
                          Valid values are:

                          * "SCRIPT" - The job was created using the
                            Glue Studio script editor.

                          * "VISUAL" - The job was created using the
                            Glue Studio visual editor.

                          * "NOTEBOOK" - The job was created using an
                            interactive sessions notebook.

                          When the "JobMode" field is missing or null,
                          "SCRIPT" is assigned as the default value.

                        * **JobRunQueuingEnabled** *(boolean) --*

                          Specifies whether job run queuing is enabled
                          for the job run.

                          A value of true means job run queuing is
                          enabled for the job run. If false or not
                          populated, the job run will not be
                          considered for queuing.

                        * **StartedOn** *(datetime) --*

                          The date and time at which this job run was
                          started.

                        * **LastModifiedOn** *(datetime) --*

                          The last time that this job run was
                          modified.

                        * **CompletedOn** *(datetime) --*

                          The date and time that this job run
                          completed.

                        * **JobRunState** *(string) --*

                          The current state of the job run. For more
                          information about the statuses of jobs that
                          have terminated abnormally, see Glue Job Run
                          Statuses.

                        * **Arguments** *(dict) --*

                          The job arguments associated with this run.
                          For this job run, they replace the default
                          arguments set in the job definition itself.

                          You can specify arguments here that your own
                          job-execution script consumes, as well as
                          arguments that Glue itself consumes.

                          Job arguments may be logged. Do not pass
                          plaintext secrets as arguments. Retrieve
                          secrets from a Glue Connection, Secrets
                          Manager or other secret management mechanism
                          if you intend to keep them within the Job.

                          For information about how to specify and
                          consume your own Job arguments, see the
                          Calling Glue APIs in Python topic in the
                          developer guide.

                          For information about the arguments you can
                          provide to this field when configuring Spark
                          jobs, see the Special Parameters Used by
                          Glue topic in the developer guide.

                          For information about the arguments you can
                          provide to this field when configuring Ray
                          jobs, see Using job parameters in Ray jobs
                          in the developer guide.

                          * *(string) --*

                            * *(string) --*

                        * **ErrorMessage** *(string) --*

                          An error message associated with this job
                          run.

                        * **PredecessorRuns** *(list) --*

                          A list of predecessors to this job run.

                          * *(dict) --*

                            A job run that was used in the predicate
                            of a conditional trigger that triggered
                            this job run.

                            * **JobName** *(string) --*

                              The name of the job definition used by
                              the predecessor job run.

                            * **RunId** *(string) --*

                              The job-run ID of the predecessor job
                              run.

                        * **AllocatedCapacity** *(integer) --*

                          This field is deprecated. Use "MaxCapacity"
                          instead.

                          The number of Glue data processing units
                          (DPUs) allocated to this JobRun. From 2 to
                          100 DPUs can be allocated; the default is
                          10. A DPU is a relative measure of
                          processing power that consists of 4 vCPUs of
                          compute capacity and 16 GB of memory. For
                          more information, see the Glue pricing page.

                        * **ExecutionTime** *(integer) --*

                          The amount of time (in seconds) that the job
                          run consumed resources.

                        * **Timeout** *(integer) --*

                          The "JobRun" timeout in minutes. This is the
                          maximum time that a job run can consume
                          resources before it is terminated and enters
                          "TIMEOUT" status. This value overrides the
                          timeout value set in the parent job.

                          Jobs must have timeout values less than 7
                          days or 10080 minutes. Otherwise, the jobs
                          will throw an exception.

                          When the value is left blank, the timeout is
                          defaulted to 2880 minutes.

                          Any existing Glue jobs that had a timeout
                          value greater than 7 days will be defaulted
                          to 7 days. For instance, if you have
                          specified a timeout of 20 days for a batch
                          job, it will be stopped on the 7th day.

                          For streaming jobs, if you have set up a
                          maintenance window, the job will be
                          restarted during the maintenance window
                          after 7 days.
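
The defaulting rules in the "Timeout" description can be sketched on the client side as follows. This is only an illustration under stated assumptions: the function name and constants are hypothetical, and it models just the documented default (2880 minutes) and the 7-day cap applied to existing jobs, not the exception the service raises for new requests above the limit.

```python
# Hypothetical constants transcribed from the Timeout description.
DEFAULT_TIMEOUT_MINUTES = 2880
MAX_TIMEOUT_MINUTES = 7 * 24 * 60  # 7 days expressed in minutes (10080)


def effective_timeout_minutes(timeout=None):
    """Return the timeout, in minutes, that would apply to a job run.

    A missing value falls back to the documented default; a value above
    7 days is capped, as described for existing jobs.
    """
    if timeout is None:
        return DEFAULT_TIMEOUT_MINUTES
    return min(timeout, MAX_TIMEOUT_MINUTES)


# A batch job configured with a 20-day timeout is treated as 7 days.
print(effective_timeout_minutes(20 * 24 * 60))
```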

                        * **MaxCapacity** *(float) --*

                          For Glue version 1.0 or earlier jobs, using
                          the standard worker type, the number of Glue
                          data processing units (DPUs) that can be
                          allocated when this job runs. A DPU is a
                          relative measure of processing power that
                          consists of 4 vCPUs of compute capacity and
                          16 GB of memory. For more information, see
                          the Glue pricing page.

                          For Glue version 2.0+ jobs, you cannot
                          specify a "Maximum capacity". Instead, you
                          should specify a "Worker type" and the
                          "Number of workers".

                          Do not set "MaxCapacity" if using
                          "WorkerType" and "NumberOfWorkers".

                          The value that can be allocated for
                          "MaxCapacity" depends on whether you are
                          running a Python shell job, an Apache Spark
                          ETL job, or an Apache Spark streaming ETL
                          job:

                          * When you specify a Python shell job
                            ("JobCommand.Name"="pythonshell"), you can
                            allocate either 0.0625 or 1 DPU. The
                            default is 0.0625 DPU.

                          * When you specify an Apache Spark ETL job
                            ("JobCommand.Name"="glueetl") or Apache
                            Spark streaming ETL job
                            ("JobCommand.Name"="gluestreaming"), you
                            can allocate from 2 to 100 DPUs. The
                            default is 10 DPUs. This job type cannot
                            have a fractional DPU allocation.
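
The "MaxCapacity" limits listed above could be checked before a request is sent, along these lines. The helper name is hypothetical and the check is a sketch of the documented ranges only; the service remains the authority on what it accepts.

```python
def max_capacity_is_valid(command_name, max_capacity):
    """Check a MaxCapacity value against the limits documented above.

    command_name is the value of JobCommand.Name.
    """
    if command_name == "pythonshell":
        # Python shell jobs: either 0.0625 or 1 DPU.
        return max_capacity in (0.0625, 1)
    if command_name in ("glueetl", "gluestreaming"):
        # Spark ETL and streaming ETL jobs: 2 to 100 DPUs, whole numbers only.
        return float(max_capacity).is_integer() and 2 <= max_capacity <= 100
    raise ValueError(f"unrecognized command name: {command_name!r}")


print(max_capacity_is_valid("pythonshell", 0.0625))
print(max_capacity_is_valid("glueetl", 2.5))
```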

                        * **WorkerType** *(string) --*

                          The type of predefined worker that is
                          allocated when a job runs. Accepts a value
                          of G.1X, G.2X, G.4X, G.8X or G.025X for
                          Spark jobs. Accepts the value Z.2X for Ray
                          jobs.

                          * For the "G.1X" worker type, each worker
                            maps to 1 DPU (4 vCPUs, 16 GB of memory)
                            with 94GB disk, and provides 1 executor
                            per worker. We recommend this worker type
                            for workloads such as data transforms,
                            joins, and queries, to offer a scalable
                            and cost-effective way to run most jobs.

                          * For the "G.2X" worker type, each worker
                            maps to 2 DPU (8 vCPUs, 32 GB of memory)
                            with 138GB disk, and provides 1 executor
                            per worker. We recommend this worker type
                            for workloads such as data transforms,
                            joins, and queries, to offer a scalable
                            and cost-effective way to run most jobs.

                          * For the "G.4X" worker type, each worker
                            maps to 4 DPU (16 vCPUs, 64 GB of memory)
                            with 256GB disk, and provides 1 executor
                            per worker. We recommend this worker type
                            for jobs whose workloads contain your most
                            demanding transforms, aggregations, joins,
                            and queries. This worker type is available
                            only for Glue version 3.0 or later Spark
                            ETL jobs in the following Amazon Web
                            Services Regions: US East (Ohio), US East
                            (N. Virginia), US West (Oregon), Asia
                            Pacific (Singapore), Asia Pacific
                            (Sydney), Asia Pacific (Tokyo), Canada
                            (Central), Europe (Frankfurt), Europe
                            (Ireland), and Europe (Stockholm).

                          * For the "G.8X" worker type, each worker
                            maps to 8 DPU (32 vCPUs, 128 GB of memory)
                            with 512GB disk, and provides 1 executor
                            per worker. We recommend this worker type
                            for jobs whose workloads contain your most
                            demanding transforms, aggregations, joins,
                            and queries. This worker type is available
                            only for Glue version 3.0 or later Spark
                            ETL jobs, in the same Amazon Web Services
                            Regions as supported for the "G.4X" worker
                            type.

                          * For the "G.025X" worker type, each worker
                            maps to 0.25 DPU (2 vCPUs, 4 GB of memory)
                            with 84GB disk, and provides 1 executor
                            per worker. We recommend this worker type
                            for low volume streaming jobs. This worker
                            type is only available for Glue version
                            3.0 or later streaming jobs.

                          * For the "Z.2X" worker type, each worker
                            maps to 2 M-DPU (8vCPUs, 64 GB of memory)
                            with 128 GB disk, and provides up to 8 Ray
                            workers based on the autoscaler.
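
As a rough aid for reading "WorkerType" together with "NumberOfWorkers", the per-worker DPU figures from the list above can be put in a lookup table. The table and function names are hypothetical; "Z.2X" is left out because it is measured in M-DPU rather than DPU.

```python
# DPUs per worker, transcribed from the WorkerType list above.
DPU_PER_WORKER = {
    "G.025X": 0.25,
    "G.1X": 1.0,
    "G.2X": 2.0,
    "G.4X": 4.0,
    "G.8X": 8.0,
}


def total_dpus(job_run):
    """Return the DPUs implied by a JobRun's WorkerType and NumberOfWorkers."""
    return DPU_PER_WORKER[job_run["WorkerType"]] * job_run["NumberOfWorkers"]


# Made-up JobRun fragment for illustration.
print(total_dpus({"WorkerType": "G.2X", "NumberOfWorkers": 10}))
```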

                        * **NumberOfWorkers** *(integer) --*

                          The number of workers of a defined
                          "workerType" that are allocated when a job
                          runs.

                        * **SecurityConfiguration** *(string) --*

                          The name of the "SecurityConfiguration"
                          structure to be used with this job run.

                        * **LogGroupName** *(string) --*

                          The name of the log group for secure logging
                          that can be server-side encrypted in Amazon
                          CloudWatch using KMS. This name can be
                          "/aws-glue/jobs/", in which case the default
                          encryption is "NONE". If you add a role name
                          and "SecurityConfiguration" name (in other
                          words, "/aws-glue/jobs-yourRoleName-
                          yourSecurityConfigurationName/"), then that
                          security configuration is used to encrypt
                          the log group.

                        * **NotificationProperty** *(dict) --*

                          Specifies configuration properties of a job
                          run notification.

                          * **NotifyDelayAfter** *(integer) --*

                            After a job run starts, the number of
                            minutes to wait before sending a job run
                            delay notification.

                        * **GlueVersion** *(string) --*

                          In Spark jobs, "GlueVersion" determines the
                          versions of Apache Spark and Python that
                          Glue makes available in a job. The Python
                          version indicates the version supported for
                          jobs of type Spark.

                          Ray jobs should set "GlueVersion" to "4.0"
                          or greater. However, the versions of Ray,
                          Python and additional libraries available in
                          your Ray job are determined by the "Runtime"
                          parameter of the Job command.

                          For more information about the available
                          Glue versions and corresponding Spark and
                          Python versions, see Glue version in the
                          developer guide.

                          Jobs that are created without specifying a
                          Glue version default to Glue 0.9.

                        * **DPUSeconds** *(float) --*

                          This field can be set for either job runs
                          with execution class "FLEX" or when Auto
                          Scaling is enabled, and represents the total
                          time each executor ran during the lifecycle
                          of a job run in seconds, multiplied by a DPU
                          factor (1 for "G.1X", 2 for "G.2X", or 0.25
                          for "G.025X" workers). This value may differ
                          from "executionEngineRuntime" multiplied by
                          "MaxCapacity", as in the case of Auto
                          Scaling jobs, because the number of
                          executors running at a given time may be
                          less than the "MaxCapacity". Therefore, it
                          is possible that the value of "DPUSeconds"
                          is less than "executionEngineRuntime"
                          multiplied by "MaxCapacity".
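
The arithmetic behind "DPUSeconds" can be illustrated with made-up figures. The function name, the executor run times, and the capacity value below are all assumptions chosen for the example; only the formula (per-executor seconds summed, then multiplied by the DPU factor) comes from the description above.

```python
def dpu_seconds(executor_seconds, dpu_factor):
    """Sum the run time of each executor in seconds and apply the DPU factor."""
    return sum(executor_seconds) * dpu_factor


# Three G.2X executors (DPU factor 2); Auto Scaling started one of them late.
used = dpu_seconds([600, 600, 300], 2)

# Ceiling for comparison: a 600-second runtime multiplied by a capacity of 6 DPUs.
ceiling = 600 * 6

print(used, ceiling, used < ceiling)
```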

                        * **ExecutionClass** *(string) --*

                          Indicates whether the job is run with a
                          standard or flexible execution class. The
                          standard execution class is ideal for
                          time-sensitive workloads that require fast
                          job startup and dedicated resources.

                          The flexible execution class is appropriate
                          for time-insensitive jobs whose start and
                          completion times may vary.

                          Only jobs with Glue version 3.0 and above
                          and command type "glueetl" will be allowed
                          to set "ExecutionClass" to "FLEX". The
                          flexible execution class is available for
                          Spark jobs.

                        * **MaintenanceWindow** *(string) --*

                          This field specifies a day of the week and
                          hour for a maintenance window for streaming
                          jobs. Glue periodically performs maintenance
                          activities. During these maintenance
                          windows, Glue will need to restart your
                          streaming jobs.

                          Glue will restart the job within 3 hours of
                          the specified maintenance window. For
                          instance, if you set up the maintenance
                          window for Monday at 10:00AM GMT, your jobs
                          will be restarted between 10:00AM GMT and
                          1:00PM GMT.

                        * **ProfileName** *(string) --*

                          The name of a Glue usage profile associated
                          with the job run.

                        * **StateDetail** *(string) --*

                          This field holds details that pertain to the
                          state of a job run. The field is nullable.

                          For example, when a job run is in a WAITING
                          state as a result of job run queuing, the
                          field has the reason why the job run is in
                          that state.

                        * **ExecutionRoleSessionPolicy** *(string) --*

                          An inline session policy passed to the
                          StartJobRun API. It allows you to
                          dynamically restrict the permissions of the
                          specified execution role for the scope of
                          the job, without requiring the creation of
                          additional IAM roles.

                  * **CrawlerDetails** *(dict) --*

                    Details of the crawler when the node represents a
                    crawler.

                    * **Crawls** *(list) --*

                      A list of crawls represented by the crawl node.

                      * *(dict) --*

                        The details of a crawl in the workflow.

                        * **State** *(string) --*

                          The state of the crawler.

                        * **StartedOn** *(datetime) --*

                          The date and time on which the crawl
                          started.

                        * **CompletedOn** *(datetime) --*

                          The date and time on which the crawl
                          completed.

                        * **ErrorMessage** *(string) --*

                          The error message associated with the crawl.

                        * **LogGroup** *(string) --*

                          The log group associated with the crawl.

                        * **LogStream** *(string) --*

                          The log stream associated with the crawl.
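
One way to use the "JobDetails" and "CrawlerDetails" structures described above is to walk the nodes of a graph and collect the runs that did not succeed. The sketch below works on a plain dictionary shaped like the documented "Graph"; the function name and the sample data are hypothetical, and treating "FAILED" and "TIMEOUT" as the failure states is a choice made for this example.

```python
def failed_steps(graph):
    """Return (node name, kind, error message) for failed job runs and crawls."""
    failures = []
    for node in graph.get("Nodes", []):
        for run in node.get("JobDetails", {}).get("JobRuns", []):
            if run.get("JobRunState") in ("FAILED", "TIMEOUT"):
                failures.append((node["Name"], "job run", run.get("ErrorMessage")))
        for crawl in node.get("CrawlerDetails", {}).get("Crawls", []):
            if crawl.get("State") == "FAILED":
                failures.append((node["Name"], "crawl", crawl.get("ErrorMessage")))
    return failures


# Made-up sample shaped like the Graph structure.
sample_graph = {
    "Nodes": [
        {
            "Name": "etl-job",
            "JobDetails": {
                "JobRuns": [
                    {"Id": "jr_1", "JobRunState": "FAILED", "ErrorMessage": "boom"},
                    {"Id": "jr_2", "JobRunState": "SUCCEEDED"},
                ]
            },
        },
        {
            "Name": "catalog-crawler",
            "CrawlerDetails": {"Crawls": [{"State": "SUCCEEDED"}]},
        },
    ]
}

for name, kind, message in failed_steps(sample_graph):
    print(name, kind, message)
```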

              * **Edges** *(list) --*

                A list of all the directed connections between the
                nodes belonging to the workflow.

                * *(dict) --*

                  An edge represents a directed connection between two
                  Glue components that are part of the workflow the
                  edge belongs to.

                  * **SourceId** *(string) --*

                    The unique ID of the node within the workflow where
                    the edge starts.

                  * **DestinationId** *(string) --*

                    The unique ID of the node within the workflow where
                    the edge ends.
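
Because each edge refers to nodes by their unique IDs, a small lookup is usually needed to read the graph by name. The following sketch builds a successor map from a dictionary shaped like the documented "Graph"; the function name and the sample values are hypothetical, and it assumes that "SourceId" and "DestinationId" match the "UniqueId" of entries in "Nodes".

```python
from collections import defaultdict


def successors_by_name(graph):
    """Map each node name to the sorted names of the nodes its edges point to."""
    names = {node["UniqueId"]: node["Name"] for node in graph.get("Nodes", [])}
    result = defaultdict(list)
    for edge in graph.get("Edges", []):
        result[names[edge["SourceId"]]].append(names[edge["DestinationId"]])
    return {name: sorted(targets) for name, targets in result.items()}


# Made-up sample: a trigger starts a job, and the job is followed by a crawler.
sample_graph = {
    "Nodes": [
        {"Type": "TRIGGER", "Name": "start", "UniqueId": "n1"},
        {"Type": "JOB", "Name": "etl-job", "UniqueId": "n2"},
        {"Type": "CRAWLER", "Name": "catalog-crawler", "UniqueId": "n3"},
    ],
    "Edges": [
        {"SourceId": "n1", "DestinationId": "n2"},
        {"SourceId": "n2", "DestinationId": "n3"},
    ],
}

print(successors_by_name(sample_graph))
```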

            * **StartingEventBatchCondition** *(dict) --*

              The batch condition that started the workflow run.

              * **BatchSize** *(integer) --*

                Number of events in the batch.

              * **BatchWindow** *(integer) --*

                Duration of the batch window in seconds.

          * **Graph** *(dict) --*

            The graph representing all the Glue components that belong
            to the workflow as nodes and directed connections between
            them as edges.

            * **Nodes** *(list) --*

              A list of the Glue components that belong to the
              workflow, represented as nodes.

              * *(dict) --*

                A node represents a Glue component (trigger, crawler,
                or job) on a workflow graph.

                * **Type** *(string) --*

                  The type of Glue component represented by the node.

                * **Name** *(string) --*

                  The name of the Glue component represented by the
                  node.

                * **UniqueId** *(string) --*

                  The unique ID assigned to the node within the
                  workflow.

                * **TriggerDetails** *(dict) --*

                  Details of the Trigger when the node represents a
                  Trigger.

                  * **Trigger** *(dict) --*

                    The information of the trigger represented by the
                    trigger node.

                    * **Name** *(string) --*

                      The name of the trigger.

                    * **WorkflowName** *(string) --*

                      The name of the workflow associated with the
                      trigger.

                    * **Id** *(string) --*

                      Reserved for future use.

                    * **Type** *(string) --*

                      The type of trigger that this is.

                    * **State** *(string) --*

                      The current state of the trigger.

                    * **Description** *(string) --*

                      A description of this trigger.

                    * **Schedule** *(string) --*

                      A "cron" expression used to specify the schedule
                      (see Time-Based Schedules for Jobs and
                      Crawlers). For example, to run something every
                      day at 12:15 UTC, you would specify:
                      "cron(15 12 * * ? *)".

                    * **Actions** *(list) --*

                      The actions initiated by this trigger.

                      * *(dict) --*

                        Defines an action to be initiated by a
                        trigger.

                        * **JobName** *(string) --*

                          The name of a job to be run.

                        * **Arguments** *(dict) --*

                          The job arguments used when this trigger
                          fires. For this job run, they replace the
                          default arguments set in the job definition
                          itself.

                          You can specify arguments here that your own
                          job-execution script consumes, as well as
                          arguments that Glue itself consumes.

                          For information about how to specify and
                          consume your own Job arguments, see the
                          Calling Glue APIs in Python topic in the
                          developer guide.

                          For information about the key-value pairs
                          that Glue consumes to set up your job, see
                          the Special Parameters Used by Glue topic in
                          the developer guide.

                          * *(string) --*

                            * *(string) --*

                        * **Timeout** *(integer) --*

                          The "JobRun" timeout in minutes. This is the
                          maximum time that a job run can consume
                          resources before it is terminated and enters
                          "TIMEOUT" status. This overrides the timeout
                          value set in the parent job.

                          Jobs must have timeout values less than 7
                          days or 10080 minutes. Otherwise, the jobs
                          will throw an exception.

                          When the value is left blank, the timeout is
                          defaulted to 2880 minutes.

                          Any existing Glue jobs that had a timeout
                          value greater than 7 days will be defaulted
                          to 7 days. For instance, if you have
                          specified a timeout of 20 days for a batch
                          job, it will be stopped on the 7th day.

                          For streaming jobs, if you have set up a
                          maintenance window, the job will be
                          restarted during the maintenance window
                          after 7 days.

                        * **SecurityConfiguration** *(string) --*

                          The name of the "SecurityConfiguration"
                          structure to be used with this action.

                        * **NotificationProperty** *(dict) --*

                          Specifies configuration properties of a job
                          run notification.

                          * **NotifyDelayAfter** *(integer) --*

                            After a job run starts, the number of
                            minutes to wait before sending a job run
                            delay notification.

                        * **CrawlerName** *(string) --*

                          The name of the crawler to be used with this
                          action.

                    * **Predicate** *(dict) --*

                      The predicate of this trigger, which defines
                      when it will fire.

                      * **Logical** *(string) --*

                        An optional field if only one condition is
                        listed. If multiple conditions are listed,
                        then this field is required.

                      * **Conditions** *(list) --*

                        A list of the conditions that determine when
                        the trigger will fire.

                        * *(dict) --*

                          Defines a condition under which a trigger
                          fires.

                          * **LogicalOperator** *(string) --*

                            A logical operator.

                          * **JobName** *(string) --*

                            The name of the job whose "JobRuns" this
                            condition applies to, and on which this
                            trigger waits.

                          * **State** *(string) --*

                            The condition state. Currently, the only
                            job states that a trigger can listen for
                            are "SUCCEEDED", "STOPPED", "FAILED", and
                            "TIMEOUT". The only crawler states that a
                            trigger can listen for are "SUCCEEDED",
                            "FAILED", and "CANCELLED".

                          * **CrawlerName** *(string) --*

                            The name of the crawler to which this
                            condition applies.

                          * **CrawlState** *(string) --*

                            The state of the crawler to which this
                            condition applies.
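
To make the roles of "Logical" and "Conditions" concrete, here is a client-side sketch that evaluates a predicate against observed states. It is not how Glue itself evaluates triggers. The function names are hypothetical, and the values "AND" and "ANY" for "Logical", as well as treating every condition as an equality test, are assumptions not stated in this section.

```python
def condition_is_met(condition, job_states, crawl_states):
    """Compare one condition with the observed state of its job or crawler."""
    if "JobName" in condition:
        return job_states.get(condition["JobName"]) == condition.get("State")
    if "CrawlerName" in condition:
        return crawl_states.get(condition["CrawlerName"]) == condition.get("CrawlState")
    return False


def predicate_is_met(predicate, job_states, crawl_states=None):
    """Combine all conditions; "ANY" needs one match, anything else needs all."""
    crawl_states = crawl_states or {}
    results = [
        condition_is_met(condition, job_states, crawl_states)
        for condition in predicate.get("Conditions", [])
    ]
    if not results:
        return False
    if predicate.get("Logical") == "ANY":
        return any(results)
    return all(results)


# Made-up predicate: fire once either job has succeeded.
predicate = {
    "Logical": "ANY",
    "Conditions": [
        {"JobName": "job-a", "State": "SUCCEEDED"},
        {"JobName": "job-b", "State": "SUCCEEDED"},
    ],
}
print(predicate_is_met(predicate, {"job-a": "SUCCEEDED", "job-b": "FAILED"}))
```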

                    * **EventBatchingCondition** *(dict) --*

                      Batch condition that must be met (specified
                      number of events received or batch time window
                      expired) before the EventBridge event trigger
                      fires.

                      * **BatchSize** *(integer) --*

                        Number of events that must be received from
                        Amazon EventBridge before the EventBridge
                        event trigger fires.

                      * **BatchWindow** *(integer) --*

                        Window of time in seconds after which the
                        EventBridge event trigger fires. The window
                        starts when the first event is received.
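
The batching rule described by "EventBatchingCondition" (fire when enough events have arrived, or when the window that opened with the first event has expired) can be modeled as below. The function and parameter names are hypothetical, and the sketch assumes that no window is open until at least one event has been received.

```python
def batch_condition_met(events_received, seconds_since_first_event, condition):
    """Return True when the batch size is reached or the batch window has expired."""
    size_reached = events_received >= condition["BatchSize"]
    window = condition.get("BatchWindow")
    window_expired = (
        window is not None
        and events_received > 0  # the window starts with the first event
        and seconds_since_first_event >= window
    )
    return size_reached or window_expired


# Made-up condition: 5 events, or 900 seconds after the first event.
condition = {"BatchSize": 5, "BatchWindow": 900}
print(batch_condition_met(2, 900, condition))
```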

                * **JobDetails** *(dict) --*

                  Details of the Job when the node represents a Job.

                  * **JobRuns** *(list) --*

                    The information for the job runs represented by
                    the job node.

                    * *(dict) --*

                      Contains information about a job run.

                      * **Id** *(string) --*

                        The ID of this job run.

                      * **Attempt** *(integer) --*

                        The number of the attempt to run this job.

                      * **PreviousRunId** *(string) --*

                        The ID of the previous run of this job. For
                        example, the "JobRunId" specified in the
                        "StartJobRun" action.

                      * **TriggerName** *(string) --*

                        The name of the trigger that started this job
                        run.

                      * **JobName** *(string) --*

                        The name of the job definition being used in
                        this run.

                      * **JobMode** *(string) --*

                        A mode that describes how a job was created.
                        Valid values are:

                        * "SCRIPT" - The job was created using the
                          Glue Studio script editor.

                        * "VISUAL" - The job was created using the
                          Glue Studio visual editor.

                        * "NOTEBOOK" - The job was created using an
                          interactive sessions notebook.

                        When the "JobMode" field is missing or null,
                        "SCRIPT" is assigned as the default value.

                      * **JobRunQueuingEnabled** *(boolean) --*

                        Specifies whether job run queuing is enabled
                        for the job run.

                        A value of true means job run queuing is
                        enabled for the job run. If false or not
                        populated, the job run will not be considered
                        for queuing.

                      * **StartedOn** *(datetime) --*

                        The date and time at which this job run was
                        started.

                      * **LastModifiedOn** *(datetime) --*

                        The last time that this job run was modified.

                      * **CompletedOn** *(datetime) --*

                        The date and time that this job run completed.

                      * **JobRunState** *(string) --*

                        The current state of the job run. For more
                        information about the statuses of jobs that
                        have terminated abnormally, see Glue Job Run
                        Statuses.

                      * **Arguments** *(dict) --*

                        The job arguments associated with this run.
                        For this job run, they replace the default
                        arguments set in the job definition itself.

                        You can specify arguments here that your own
                        job-execution script consumes, as well as
                        arguments that Glue itself consumes.

                        Job arguments may be logged. Do not pass
                        plaintext secrets as arguments. Retrieve
                        secrets from a Glue Connection, Secrets
                        Manager or other secret management mechanism
                        if you intend to keep them within the Job.

                        For information about how to specify and
                        consume your own Job arguments, see the
                        Calling Glue APIs in Python topic in the
                        developer guide.

                        For information about the arguments you can
                        provide to this field when configuring Spark
                        jobs, see the Special Parameters Used by Glue
                        topic in the developer guide.

                        For information about the arguments you can
                        provide to this field when configuring Ray
                        jobs, see Using job parameters in Ray jobs in
                        the developer guide.

                        * *(string) --*

                          * *(string) --*

                      * **ErrorMessage** *(string) --*

                        An error message associated with this job run.

                      * **PredecessorRuns** *(list) --*

                        A list of predecessors to this job run.

                        * *(dict) --*

                          A job run that was used in the predicate of
                          a conditional trigger that triggered this
                          job run.

                          * **JobName** *(string) --*

                            The name of the job definition used by the
                            predecessor job run.

                          * **RunId** *(string) --*

                            The job-run ID of the predecessor job run.

                      * **AllocatedCapacity** *(integer) --*

                        This field is deprecated. Use "MaxCapacity"
                        instead.

                        The number of Glue data processing units
                        (DPUs) allocated to this JobRun. From 2 to 100
                        DPUs can be allocated; the default is 10. A
                        DPU is a relative measure of processing power
                        that consists of 4 vCPUs of compute capacity
                        and 16 GB of memory. For more information, see
                        the Glue pricing page.

                      * **ExecutionTime** *(integer) --*

                        The amount of time (in seconds) that the job
                        run consumed resources.

                      * **Timeout** *(integer) --*

                        The "JobRun" timeout in minutes. This is the
                        maximum time that a job run can consume
                        resources before it is terminated and enters
                        "TIMEOUT" status. This value overrides the
                        timeout value set in the parent job.

                        Jobs must have timeout values less than 7 days
                        or 10080 minutes. Otherwise, the jobs will
                        throw an exception.

                        When the value is left blank, the timeout is
                        defaulted to 2880 minutes.

                        Any existing Glue jobs that had a timeout
                        value greater than 7 days will be defaulted to
                        7 days. For instance, if you have specified a
                        timeout of 20 days for a batch job, it will be
                        stopped on the 7th day.

                        For streaming jobs, if you have set up a
                        maintenance window, it will be restarted
                        during the maintenance window after 7 days.

                      * **MaxCapacity** *(float) --*

                        For Glue version 1.0 or earlier jobs, using
                        the standard worker type, the number of Glue
                        data processing units (DPUs) that can be
                        allocated when this job runs. A DPU is a
                        relative measure of processing power that
                        consists of 4 vCPUs of compute capacity and 16
                        GB of memory. For more information, see the
                        Glue pricing page.

                        For Glue version 2.0+ jobs, you cannot specify
                        a "Maximum capacity". Instead, you should
                        specify a "Worker type" and the "Number of
                        workers".

                        Do not set "MaxCapacity" if using "WorkerType"
                        and "NumberOfWorkers".

                        The value that can be allocated for
                        "MaxCapacity" depends on whether you are
                        running a Python shell job, an Apache Spark
                        ETL job, or an Apache Spark streaming ETL job:

                        * When you specify a Python shell job
                          ("JobCommand.Name"="pythonshell"), you can
                          allocate either 0.0625 or 1 DPU. The default
                          is 0.0625 DPU.

                        * When you specify an Apache Spark ETL job
                          ("JobCommand.Name"="glueetl") or Apache
                          Spark streaming ETL job
                          ("JobCommand.Name"="gluestreaming"), you can
                          allocate from 2 to 100 DPUs. The default is
                          10 DPUs. This job type cannot have a
                          fractional DPU allocation.

                      * **WorkerType** *(string) --*

                        The type of predefined worker that is
                        allocated when a job runs. Accepts a value of
                        G.1X, G.2X, G.4X, G.8X or G.025X for Spark
                        jobs. Accepts the value Z.2X for Ray jobs.

                        * For the "G.1X" worker type, each worker maps
                          to 1 DPU (4 vCPUs, 16 GB of memory) with
                          94GB disk, and provides 1 executor per
                          worker. We recommend this worker type for
                          workloads such as data transforms, joins,
                          and queries, as it offers a scalable and
                          cost-effective way to run most jobs.

                        * For the "G.2X" worker type, each worker maps
                          to 2 DPU (8 vCPUs, 32 GB of memory) with
                          138GB disk, and provides 1 executor per
                          worker. We recommend this worker type for
                          workloads such as data transforms, joins,
                          and queries, as it offers a scalable and
                          cost-effective way to run most jobs.

                        * For the "G.4X" worker type, each worker maps
                          to 4 DPU (16 vCPUs, 64 GB of memory) with
                          256GB disk, and provides 1 executor per
                          worker. We recommend this worker type for
                          jobs whose workloads contain your most
                          demanding transforms, aggregations, joins,
                          and queries. This worker type is available
                          only for Glue version 3.0 or later Spark ETL
                          jobs in the following Amazon Web Services
                          Regions: US East (Ohio), US East (N.
                          Virginia), US West (Oregon), Asia Pacific
                          (Singapore), Asia Pacific (Sydney), Asia
                          Pacific (Tokyo), Canada (Central), Europe
                          (Frankfurt), Europe (Ireland), and Europe
                          (Stockholm).

                        * For the "G.8X" worker type, each worker maps
                          to 8 DPU (32 vCPUs, 128 GB of memory) with
                          512GB disk, and provides 1 executor per
                          worker. We recommend this worker type for
                          jobs whose workloads contain your most
                          demanding transforms, aggregations, joins,
                          and queries. This worker type is available
                          only for Glue version 3.0 or later Spark ETL
                          jobs, in the same Amazon Web Services
                          Regions as supported for the "G.4X" worker
                          type.

                        * For the "G.025X" worker type, each worker
                          maps to 0.25 DPU (2 vCPUs, 4 GB of memory)
                          with 84GB disk, and provides 1 executor per
                          worker. We recommend this worker type for
                          low volume streaming jobs. This worker type
                          is only available for Glue version 3.0 or
                          later streaming jobs.

                        * For the "Z.2X" worker type, each worker maps
                          to 2 M-DPU (8 vCPUs, 64 GB of memory) with
                          128 GB disk, and provides up to 8 Ray
                          workers based on the autoscaler.

                      * **NumberOfWorkers** *(integer) --*

                        The number of workers of a defined
                        "workerType" that are allocated when a job
                        runs.

                      * **SecurityConfiguration** *(string) --*

                        The name of the "SecurityConfiguration"
                        structure to be used with this job run.

                      * **LogGroupName** *(string) --*

                        The name of the log group for secure logging
                        that can be server-side encrypted in Amazon
                        CloudWatch using KMS. This name can be
                        "/aws-glue/jobs/", in which case the default
                        encryption is "NONE". If you add a role name
                        and "SecurityConfiguration" name (in other
                        words,
                        "/aws-glue/jobs-yourRoleName-yourSecurityConfigurationName/"),
                        then that security configuration is used to
                        encrypt the log group.

                      * **NotificationProperty** *(dict) --*

                        Specifies configuration properties of a job
                        run notification.

                        * **NotifyDelayAfter** *(integer) --*

                          After a job run starts, the number of
                          minutes to wait before sending a job run
                          delay notification.

                      * **GlueVersion** *(string) --*

                        In Spark jobs, "GlueVersion" determines the
                        versions of Apache Spark and Python that Glue
                        makes available in a job. The Python version
                        indicates the version supported for jobs of
                        type Spark.

                        Ray jobs should set "GlueVersion" to "4.0" or
                        greater. However, the versions of Ray, Python
                        and additional libraries available in your Ray
                        job are determined by the "Runtime" parameter
                        of the Job command.

                        For more information about the available Glue
                        versions and corresponding Spark and Python
                        versions, see Glue version in the developer
                        guide.

                        Jobs that are created without specifying a
                        Glue version default to Glue 0.9.

                      * **DPUSeconds** *(float) --*

                        This field can be set for either job runs with
                        execution class "FLEX" or when Auto Scaling is
                        enabled, and represents the total time each
                        executor ran during the lifecycle of a job run
                        in seconds, multiplied by a DPU factor (1 for
                        "G.1X", 2 for "G.2X", or 0.25 for "G.025X"
                        workers). This value may be different than the
                        "executionEngineRuntime" * "MaxCapacity" as in
                        the case of Auto Scaling jobs, as the number
                        of executors running at a given time may be
                        less than the "MaxCapacity". Therefore, it is
                        possible that the value of "DPUSeconds" is
                        less than "executionEngineRuntime" *
                        "MaxCapacity".

                      * **ExecutionClass** *(string) --*

                        Indicates whether the job is run with a
                        standard or flexible execution class. The
                        standard execution class is ideal for
                        time-sensitive workloads that require fast job
                        startup and dedicated resources.

                        The flexible execution class is appropriate
                        for time-insensitive jobs whose start and
                        completion times may vary.

                        Only jobs with Glue version 3.0 and above and
                        command type "glueetl" will be allowed to set
                        "ExecutionClass" to "FLEX". The flexible
                        execution class is available for Spark jobs.

                      * **MaintenanceWindow** *(string) --*

                        This field specifies a day of the week and
                        hour for a maintenance window for streaming
                        jobs. Glue periodically performs maintenance
                        activities. During these maintenance windows,
                        Glue will need to restart your streaming jobs.

                        Glue will restart the job within 3 hours of
                        the specified maintenance window. For
                        instance, if you set up the maintenance window
                        for Monday at 10:00AM GMT, your jobs will be
                        restarted between 10:00AM GMT and 1:00PM GMT.

                      * **ProfileName** *(string) --*

                        The name of a Glue usage profile associated
                        with the job run.

                      * **StateDetail** *(string) --*

                        This field holds details that pertain to the
                        state of a job run. The field is nullable.

                        For example, when a job run is in a WAITING
                        state as a result of job run queuing, the
                        field has the reason why the job run is in
                        that state.

                      * **ExecutionRoleSessionPolicy** *(string) --*

                        An inline session policy, passed to the
                        StartJobRun API, that lets you dynamically
                        restrict the permissions of the specified
                        execution role for the scope of the job,
                        without requiring the creation of additional
                        IAM roles.

                * **CrawlerDetails** *(dict) --*

                  Details of the crawler when the node represents a
                  crawler.

                  * **Crawls** *(list) --*

                    A list of crawls represented by the crawl node.

                    * *(dict) --*

                      The details of a crawl in the workflow.

                      * **State** *(string) --*

                        The state of the crawler.

                      * **StartedOn** *(datetime) --*

                        The date and time on which the crawl started.

                      * **CompletedOn** *(datetime) --*

                        The date and time on which the crawl
                        completed.

                      * **ErrorMessage** *(string) --*

                        The error message associated with the crawl.

                      * **LogGroup** *(string) --*

                        The log group associated with the crawl.

                      * **LogStream** *(string) --*

                        The log stream associated with the crawl.

            * **Edges** *(list) --*

              A list of all the directed connections between the nodes
              belonging to the workflow.

              * *(dict) --*

                An edge represents a directed connection between two
                Glue components that are part of the workflow the edge
                belongs to.

                * **SourceId** *(string) --*

                  The unique ID of the node within the workflow where
                  the edge starts.

                * **DestinationId** *(string) --*

                  The unique ID of the node within the workflow where
                  the edge ends.

          * **MaxConcurrentRuns** *(integer) --*

            You can use this parameter to prevent unwanted multiple
            updates to data, to control costs, or in some cases, to
            prevent exceeding the maximum number of concurrent runs of
            any of the component jobs. If you leave this parameter
            blank, there is no limit to the number of concurrent
            workflow runs.

          * **BlueprintDetails** *(dict) --*

            This structure indicates the details of the blueprint that
            this particular workflow is created from.

            * **BlueprintName** *(string) --*

              The name of the blueprint.

            * **RunId** *(string) --*

              The run ID for this blueprint.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
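
The worker-type and edge descriptions above can be turned into two small helpers. This is a minimal sketch, not part of the Glue API: "total_dpu" and "build_adjacency" are hypothetical names, the DPU values are copied from the "WorkerType" text, and the sample edge IDs are made up for illustration.

```python
from collections import defaultdict

# DPU per worker, copied from the WorkerType descriptions above.
# Z.2X is measured in M-DPU in the text, so it is deliberately left out.
DPU_PER_WORKER = {
    "G.025X": 0.25,
    "G.1X": 1.0,
    "G.2X": 2.0,
    "G.4X": 4.0,
    "G.8X": 8.0,
}


def total_dpu(worker_type, number_of_workers):
    """Return the nominal DPU total for a Spark job run (hypothetical helper)."""
    if worker_type not in DPU_PER_WORKER:
        raise ValueError(f"no DPU mapping for worker type {worker_type!r}")
    return DPU_PER_WORKER[worker_type] * number_of_workers


def build_adjacency(edges):
    """Map each SourceId to the list of DestinationIds it points to."""
    adjacency = defaultdict(list)
    for edge in edges:
        adjacency[edge["SourceId"]].append(edge["DestinationId"])
    return dict(adjacency)


# Made-up sample shaped like the "Edges" list in the response structure.
sample_edges = [
    {"SourceId": "node-a", "DestinationId": "node-b"},
    {"SourceId": "node-a", "DestinationId": "node-c"},
    {"SourceId": "node-b", "DestinationId": "node-c"},
]
graph = build_adjacency(sample_edges)
```

With a real response, the list stored under "Edges" would be passed to "build_adjacency" in place of "sample_edges".
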


list_statements
***************

Glue.Client.list_statements(**kwargs)

   Lists statements for the session.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.list_statements(
          SessionId='string',
          RequestOrigin='string',
          NextToken='string'
      )

   Parameters:
      * **SessionId** (*string*) --

        **[REQUIRED]**

        The Session ID of the statements.

      * **RequestOrigin** (*string*) -- The origin of the request to
        list statements.

      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation call.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Statements': [
                 {
                     'Id': 123,
                     'Code': 'string',
                     'State': 'WAITING'|'RUNNING'|'AVAILABLE'|'CANCELLING'|'CANCELLED'|'ERROR',
                     'Output': {
                         'Data': {
                             'TextPlain': 'string'
                         },
                         'ExecutionCount': 123,
                         'Status': 'WAITING'|'RUNNING'|'AVAILABLE'|'CANCELLING'|'CANCELLED'|'ERROR',
                         'ErrorName': 'string',
                         'ErrorValue': 'string',
                         'Traceback': [
                             'string',
                         ]
                     },
                     'Progress': 123.0,
                     'StartedOn': 123,
                     'CompletedOn': 123
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Statements** *(list) --*

          Returns the list of statements.

          * *(dict) --*

            The statement or request for a particular action to occur
            in a session.

            * **Id** *(integer) --*

              The ID of the statement.

            * **Code** *(string) --*

              The execution code of the statement.

            * **State** *(string) --*

              The state while the request is actioned.

            * **Output** *(dict) --*

              The output in JSON.

              * **Data** *(dict) --*

                The code execution output.

                * **TextPlain** *(string) --*

                  The code execution output in text format.

              * **ExecutionCount** *(integer) --*

                The execution count of the output.

              * **Status** *(string) --*

                The status of the code execution output.

              * **ErrorName** *(string) --*

                The name of the error in the output.

              * **ErrorValue** *(string) --*

                The error value of the output.

              * **Traceback** *(list) --*

                The traceback of the output.

                * *(string) --*

            * **Progress** *(float) --*

              The code execution progress.

            * **StartedOn** *(integer) --*

              The Unix time and date that the job definition was
              started.

            * **CompletedOn** *(integer) --*

              The Unix time and date that the job definition was
              completed.

        * **NextToken** *(string) --*

          A continuation token, if not all statements have yet been
          returned.

   **Exceptions**

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.IllegalSessionStateException"
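
Because the response carries a "NextToken" whenever more statements remain, a caller usually loops until the token is absent. The sketch below shows one way to do that. "iter_statements" and "failed_statements" are hypothetical helper names, and "client" is assumed to be a Glue client such as the one created with boto3.client('glue').

```python
def iter_statements(client, session_id):
    """Yield every statement in a session, following NextToken."""
    kwargs = {"SessionId": session_id}
    while True:
        response = client.list_statements(**kwargs)
        yield from response.get("Statements", [])
        token = response.get("NextToken")
        if not token:
            # No continuation token means the last page was reached.
            break
        kwargs["NextToken"] = token


def failed_statements(statements):
    """Keep only statements whose State is 'ERROR'."""
    return [s for s in statements if s.get("State") == "ERROR"]
```

A typical call would be failed_statements(iter_statements(client, 'my-session-id')), where the session ID is a placeholder.
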


list_data_quality_rule_recommendation_runs
******************************************

Glue.Client.list_data_quality_rule_recommendation_runs(**kwargs)

   Lists the recommendation runs meeting the filter criteria.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.list_data_quality_rule_recommendation_runs(
          Filter={
              'DataSource': {
                  'GlueTable': {
                      'DatabaseName': 'string',
                      'TableName': 'string',
                      'CatalogId': 'string',
                      'ConnectionName': 'string',
                      'AdditionalOptions': {
                          'string': 'string'
                      }
                  }
              },
              'StartedBefore': datetime(2015, 1, 1),
              'StartedAfter': datetime(2015, 1, 1)
          },
          NextToken='string',
          MaxResults=123
      )

   Parameters:
      * **Filter** (*dict*) --

        The filter criteria.

        * **DataSource** *(dict) --* **[REQUIRED]**

          Filter based on a specified data source (Glue table).

          * **GlueTable** *(dict) --* **[REQUIRED]**

            A Glue table.

            * **DatabaseName** *(string) --* **[REQUIRED]**

              A database name in the Glue Data Catalog.

            * **TableName** *(string) --* **[REQUIRED]**

              A table name in the Glue Data Catalog.

            * **CatalogId** *(string) --*

              A unique identifier for the Glue Data Catalog.

            * **ConnectionName** *(string) --*

              The name of the connection to the Glue Data Catalog.

            * **AdditionalOptions** *(dict) --*

              Additional options for the table. Currently there are
              two keys supported:

              * "pushDownPredicate": to filter on partitions without
                having to list and read all the files in your dataset.

              * "catalogPartitionPredicate": to use server-side
                partition pruning using partition indexes in the Glue
                Data Catalog.

              * *(string) --*

                * *(string) --*

        * **StartedBefore** *(datetime) --*

          Filter based on time for results started before provided
          time.

        * **StartedAfter** *(datetime) --*

          Filter based on time for results started after provided
          time.

      * **NextToken** (*string*) -- A pagination token to offset the
        results.

      * **MaxResults** (*integer*) -- The maximum number of results to
        return.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Runs': [
                 {
                     'RunId': 'string',
                     'Status': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT',
                     'StartedOn': datetime(2015, 1, 1),
                     'DataSource': {
                         'GlueTable': {
                             'DatabaseName': 'string',
                             'TableName': 'string',
                             'CatalogId': 'string',
                             'ConnectionName': 'string',
                             'AdditionalOptions': {
                                 'string': 'string'
                             }
                         }
                     }
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Runs** *(list) --*

          A list of "DataQualityRuleRecommendationRunDescription"
          objects.

          * *(dict) --*

            Describes the result of a data quality rule recommendation
            run.

            * **RunId** *(string) --*

              The unique run identifier associated with this run.

            * **Status** *(string) --*

              The status for this run.

            * **StartedOn** *(datetime) --*

              The date and time when this run started.

            * **DataSource** *(dict) --*

              The data source (Glue table) associated with the
              recommendation run.

              * **GlueTable** *(dict) --*

                A Glue table.

                * **DatabaseName** *(string) --*

                  A database name in the Glue Data Catalog.

                * **TableName** *(string) --*

                  A table name in the Glue Data Catalog.

                * **CatalogId** *(string) --*

                  A unique identifier for the Glue Data Catalog.

                * **ConnectionName** *(string) --*

                  The name of the connection to the Glue Data Catalog.

                * **AdditionalOptions** *(dict) --*

                  Additional options for the table. Currently there
                  are two keys supported:

                  * "pushDownPredicate": to filter on partitions
                    without having to list and read all the files in
                    your dataset.

                  * "catalogPartitionPredicate": to use server-side
                    partition pruning using partition indexes in the
                    Glue Data Catalog.

                  * *(string) --*

                    * *(string) --*

        * **NextToken** *(string) --*

          A pagination token, if more results are available.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"
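
The "Filter" argument nests several required keys ("DataSource", "GlueTable", "DatabaseName", "TableName"), so it can help to build it in one place. The following is a minimal sketch. "build_run_filter" is a hypothetical helper, and the database and table names are placeholders.

```python
from datetime import datetime, timezone


def build_run_filter(database_name, table_name,
                     started_after=None, started_before=None):
    """Build a Filter dict with the required DataSource/GlueTable keys."""
    run_filter = {
        "DataSource": {
            "GlueTable": {
                "DatabaseName": database_name,
                "TableName": table_name,
            }
        }
    }
    # The time bounds are optional, so add them only when given.
    if started_after is not None:
        run_filter["StartedAfter"] = started_after
    if started_before is not None:
        run_filter["StartedBefore"] = started_before
    return run_filter


# Placeholder names; substitute a real database and table.
example_filter = build_run_filter(
    "sales_db",
    "orders",
    started_after=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
```

The result would then be passed as client.list_data_quality_rule_recommendation_runs(Filter=example_filter).
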


list_ml_transforms
******************

Glue.Client.list_ml_transforms(**kwargs)

   Retrieves a sortable, filterable list of existing Glue machine
   learning transforms in this Amazon Web Services account, or the
   resources with the specified tag. This operation takes the optional
   "Tags" field, which you can use as a filter of the responses so
   that tagged resources can be retrieved as a group. If you choose to
   use tag filtering, only resources with the tags are retrieved.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.list_ml_transforms(
          NextToken='string',
          MaxResults=123,
          Filter={
              'Name': 'string',
              'TransformType': 'FIND_MATCHES',
              'Status': 'NOT_READY'|'READY'|'DELETING',
              'GlueVersion': 'string',
              'CreatedBefore': datetime(2015, 1, 1),
              'CreatedAfter': datetime(2015, 1, 1),
              'LastModifiedBefore': datetime(2015, 1, 1),
              'LastModifiedAfter': datetime(2015, 1, 1),
              'Schema': [
                  {
                      'Name': 'string',
                      'DataType': 'string'
                  },
              ]
          },
          Sort={
              'Column': 'NAME'|'TRANSFORM_TYPE'|'STATUS'|'CREATED'|'LAST_MODIFIED',
              'SortDirection': 'DESCENDING'|'ASCENDING'
          },
          Tags={
              'string': 'string'
          }
      )

   Parameters:
      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation request.

      * **MaxResults** (*integer*) -- The maximum size of a list to
        return.

      * **Filter** (*dict*) --

        A "TransformFilterCriteria" used to filter the machine
        learning transforms.

        * **Name** *(string) --*

          A unique transform name that is used to filter the machine
          learning transforms.

        * **TransformType** *(string) --*

          The type of machine learning transform that is used to
          filter the machine learning transforms.

        * **Status** *(string) --*

          Filters the list of machine learning transforms by the last
          known status of the transforms (to indicate whether a
          transform can be used or not). One of "NOT_READY", "READY",
          or "DELETING".

        * **GlueVersion** *(string) --*

          This value determines which version of Glue this machine
          learning transform is compatible with. Glue 1.0 is
          recommended for most customers. If the value is not set, the
          Glue compatibility defaults to Glue 0.9. For more
          information, see Glue Versions in the developer guide.

        * **CreatedBefore** *(datetime) --*

          The time and date before which the transforms were created.

        * **CreatedAfter** *(datetime) --*

          The time and date after which the transforms were created.

        * **LastModifiedBefore** *(datetime) --*

          Filter on transforms last modified before this date.

        * **LastModifiedAfter** *(datetime) --*

          Filter on transforms last modified after this date.

        * **Schema** *(list) --*

          Filters on datasets with a specific schema. The "Map<Column,
          Type>" object is an array of key-value pairs representing
          the schema this transform accepts, where "Column" is the
          name of a column, and "Type" is the type of the data such as
          an integer or string. Has an upper bound of 100 columns.

          * *(dict) --*

            A key-value pair representing a column and data type that
            this transform can run against. The "Schema" parameter of
            the "MLTransform" may contain up to 100 of these
            structures.

            * **Name** *(string) --*

              The name of the column.

            * **DataType** *(string) --*

              The type of data in the column.

      * **Sort** (*dict*) --

        A "TransformSortCriteria" used to sort the machine learning
        transforms.

        * **Column** *(string) --* **[REQUIRED]**

          The column to be used in the sorting criteria that are
          associated with the machine learning transform.

        * **SortDirection** *(string) --* **[REQUIRED]**

          The sort direction to be used in the sorting criteria that
          are associated with the machine learning transform.

      * **Tags** (*dict*) --

        Specifies to return only these tagged resources.

        * *(string) --*

          * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'TransformIds': [
                 'string',
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **TransformIds** *(list) --*

          The identifiers of all the machine learning transforms in
          the account, or the machine learning transforms with the
          specified tags.

          * *(string) --*

        * **NextToken** *(string) --*

          A continuation token, if the returned list does not contain
          the last metric available.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"
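
Since "Sort" requires both "Column" and "SortDirection", and each accepts only the values listed in the request syntax, the arguments can be validated before the call is made. This is a sketch under those assumptions. "build_list_ml_transforms_kwargs" is a hypothetical helper, not a boto3 function.

```python
# Allowed values copied from the Request Syntax above.
SORT_COLUMNS = {"NAME", "TRANSFORM_TYPE", "STATUS", "CREATED", "LAST_MODIFIED"}
SORT_DIRECTIONS = {"ASCENDING", "DESCENDING"}
STATUSES = {"NOT_READY", "READY", "DELETING"}


def build_list_ml_transforms_kwargs(status=None, sort_column=None,
                                    sort_direction="ASCENDING", tags=None):
    """Assemble keyword arguments for list_ml_transforms.

    Only the requested pieces are included, so omitted filters are not sent.
    """
    kwargs = {}
    if status is not None:
        if status not in STATUSES:
            raise ValueError(f"unknown Status: {status!r}")
        kwargs["Filter"] = {"Status": status}
    if sort_column is not None:
        if sort_column not in SORT_COLUMNS:
            raise ValueError(f"unknown sort Column: {sort_column!r}")
        if sort_direction not in SORT_DIRECTIONS:
            raise ValueError(f"unknown SortDirection: {sort_direction!r}")
        # Column and SortDirection are both marked [REQUIRED] inside Sort.
        kwargs["Sort"] = {"Column": sort_column, "SortDirection": sort_direction}
    if tags:
        kwargs["Tags"] = dict(tags)
    return kwargs


example_kwargs = build_list_ml_transforms_kwargs(
    status="READY", sort_column="LAST_MODIFIED", sort_direction="DESCENDING"
)
```

The dictionary can then be unpacked into the call, as in client.list_ml_transforms(**example_kwargs).
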


get_table
*********

Glue.Client.get_table(**kwargs)

   Retrieves the "Table" definition in a Data Catalog for a specified
   table.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_table(
          CatalogId='string',
          DatabaseName='string',
          Name='string',
          TransactionId='string',
          QueryAsOfTime=datetime(2015, 1, 1),
          IncludeStatusDetails=True|False
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog where
        the table resides. If none is provided, the Amazon Web
        Services account ID is used by default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the database in the catalog in which the table
        resides. For Hive compatibility, this name is entirely
        lowercase.

      * **Name** (*string*) --

        **[REQUIRED]**

        The name of the table for which to retrieve the definition.
        For Hive compatibility, this name is entirely lowercase.

      * **TransactionId** (*string*) -- The transaction ID at which to
        read the table contents.

      * **QueryAsOfTime** (*datetime*) -- The time as of when to read
        the table contents. If not set, the most recent transaction
        commit time will be used. Cannot be specified along with
        "TransactionId".

      * **IncludeStatusDetails** (*boolean*) -- Specifies whether to
        include status details related to a request to create or
        update a Glue Data Catalog view.
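
   As noted above, "QueryAsOfTime" cannot be specified along with
   "TransactionId". The following is a minimal sketch of assembling
   the keyword arguments while checking that rule on the client side.
   The helper name "build_get_table_kwargs" and the database and table
   names are hypothetical and not part of boto3:

```python
from datetime import datetime, timezone


def build_get_table_kwargs(database_name, name, catalog_id=None,
                           transaction_id=None, query_as_of_time=None):
    """Assemble keyword arguments for client.get_table().

    Hypothetical helper (not part of boto3). Optional arguments are only
    included when set, and the documented TransactionId / QueryAsOfTime
    exclusivity is checked before any request is made.
    """
    if transaction_id is not None and query_as_of_time is not None:
        raise ValueError(
            "QueryAsOfTime cannot be specified along with TransactionId")

    kwargs = {"DatabaseName": database_name, "Name": name}
    if catalog_id is not None:
        kwargs["CatalogId"] = catalog_id
    if transaction_id is not None:
        kwargs["TransactionId"] = transaction_id
    if query_as_of_time is not None:
        kwargs["QueryAsOfTime"] = query_as_of_time
    return kwargs


# Example: read the table as of a fixed point in time.
# "sales_db" and "orders" are placeholder names.
kwargs = build_get_table_kwargs(
    "sales_db", "orders",
    query_as_of_time=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
# response = client.get_table(**kwargs)
```

   Leaving unset options out of the call, rather than passing "None",
   keeps the request limited to the parameters that were actually
   chosen.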

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Table': {
                 'Name': 'string',
                 'DatabaseName': 'string',
                 'Description': 'string',
                 'Owner': 'string',
                 'CreateTime': datetime(2015, 1, 1),
                 'UpdateTime': datetime(2015, 1, 1),
                 'LastAccessTime': datetime(2015, 1, 1),
                 'LastAnalyzedTime': datetime(2015, 1, 1),
                 'Retention': 123,
                 'StorageDescriptor': {
                     'Columns': [
                         {
                             'Name': 'string',
                             'Type': 'string',
                             'Comment': 'string',
                             'Parameters': {
                                 'string': 'string'
                             }
                         },
                     ],
                     'Location': 'string',
                     'AdditionalLocations': [
                         'string',
                     ],
                     'InputFormat': 'string',
                     'OutputFormat': 'string',
                     'Compressed': True|False,
                     'NumberOfBuckets': 123,
                     'SerdeInfo': {
                         'Name': 'string',
                         'SerializationLibrary': 'string',
                         'Parameters': {
                             'string': 'string'
                         }
                     },
                     'BucketColumns': [
                         'string',
                     ],
                     'SortColumns': [
                         {
                             'Column': 'string',
                             'SortOrder': 123
                         },
                     ],
                     'Parameters': {
                         'string': 'string'
                     },
                     'SkewedInfo': {
                         'SkewedColumnNames': [
                             'string',
                         ],
                         'SkewedColumnValues': [
                             'string',
                         ],
                         'SkewedColumnValueLocationMaps': {
                             'string': 'string'
                         }
                     },
                     'StoredAsSubDirectories': True|False,
                     'SchemaReference': {
                         'SchemaId': {
                             'SchemaArn': 'string',
                             'SchemaName': 'string',
                             'RegistryName': 'string'
                         },
                         'SchemaVersionId': 'string',
                         'SchemaVersionNumber': 123
                     }
                 },
                 'PartitionKeys': [
                     {
                         'Name': 'string',
                         'Type': 'string',
                         'Comment': 'string',
                         'Parameters': {
                             'string': 'string'
                         }
                     },
                 ],
                 'ViewOriginalText': 'string',
                 'ViewExpandedText': 'string',
                 'TableType': 'string',
                 'Parameters': {
                     'string': 'string'
                 },
                 'CreatedBy': 'string',
                 'IsRegisteredWithLakeFormation': True|False,
                 'TargetTable': {
                     'CatalogId': 'string',
                     'DatabaseName': 'string',
                     'Name': 'string',
                     'Region': 'string'
                 },
                 'CatalogId': 'string',
                 'VersionId': 'string',
                 'FederatedTable': {
                     'Identifier': 'string',
                     'DatabaseIdentifier': 'string',
                     'ConnectionName': 'string',
                     'ConnectionType': 'string'
                 },
                 'ViewDefinition': {
                     'IsProtected': True|False,
                     'Definer': 'string',
                     'SubObjects': [
                         'string',
                     ],
                     'Representations': [
                         {
                             'Dialect': 'REDSHIFT'|'ATHENA'|'SPARK',
                             'DialectVersion': 'string',
                             'ViewOriginalText': 'string',
                             'ViewExpandedText': 'string',
                             'ValidationConnection': 'string',
                             'IsStale': True|False
                         },
                     ]
                 },
                 'IsMultiDialectView': True|False,
                 'Status': {
                     'RequestedBy': 'string',
                     'UpdatedBy': 'string',
                     'RequestTime': datetime(2015, 1, 1),
                     'UpdateTime': datetime(2015, 1, 1),
                     'Action': 'UPDATE'|'CREATE',
                     'State': 'QUEUED'|'IN_PROGRESS'|'SUCCESS'|'STOPPED'|'FAILED',
                     'Error': {
                         'ErrorCode': 'string',
                         'ErrorMessage': 'string'
                     },
                     'Details': {
                         'RequestedChange': {'... recursive ...'},
                         'ViewValidations': [
                             {
                                 'Dialect': 'REDSHIFT'|'ATHENA'|'SPARK',
                                 'DialectVersion': 'string',
                                 'ViewValidationText': 'string',
                                 'UpdateTime': datetime(2015, 1, 1),
                                 'State': 'QUEUED'|'IN_PROGRESS'|'SUCCESS'|'STOPPED'|'FAILED',
                                 'Error': {
                                     'ErrorCode': 'string',
                                     'ErrorMessage': 'string'
                                 }
                             },
                         ]
                     }
                 }
             }
         }

      **Response Structure**

      * *(dict) --*

        * **Table** *(dict) --*

          The "Table" object that defines the specified table.

          * **Name** *(string) --*

            The table name. For Hive compatibility, this must be
            entirely lowercase.

          * **DatabaseName** *(string) --*

            The name of the database where the table metadata resides.
            For Hive compatibility, this must be all lowercase.

          * **Description** *(string) --*

            A description of the table.

          * **Owner** *(string) --*

            The owner of the table.

          * **CreateTime** *(datetime) --*

            The time when the table definition was created in the Data
            Catalog.

          * **UpdateTime** *(datetime) --*

            The last time that the table was updated.

          * **LastAccessTime** *(datetime) --*

            The last time that the table was accessed. This is usually
            taken from HDFS, and might not be reliable.

          * **LastAnalyzedTime** *(datetime) --*

            The last time that column statistics were computed for
            this table.

          * **Retention** *(integer) --*

            The retention time for this table.

          * **StorageDescriptor** *(dict) --*

            A storage descriptor containing information about the
            physical storage of this table.

            * **Columns** *(list) --*

              A list of the "Columns" in the table.

              * *(dict) --*

                A column in a "Table".

                * **Name** *(string) --*

                  The name of the "Column".

                * **Type** *(string) --*

                  The data type of the "Column".

                * **Comment** *(string) --*

                  A free-form text comment.

                * **Parameters** *(dict) --*

                  These key-value pairs define properties associated
                  with the column.

                  * *(string) --*

                    * *(string) --*

            * **Location** *(string) --*

              The physical location of the table. By default, this
              takes the form of the warehouse location, followed by
              the database location in the warehouse, followed by the
              table name.

            * **AdditionalLocations** *(list) --*

              A list of locations that point to the path where a Delta
              table is located.

              * *(string) --*

            * **InputFormat** *(string) --*

              The input format: "SequenceFileInputFormat" (binary), or
              "TextInputFormat", or a custom format.

            * **OutputFormat** *(string) --*

              The output format: "SequenceFileOutputFormat" (binary),
              or "IgnoreKeyTextOutputFormat", or a custom format.

            * **Compressed** *(boolean) --*

              "True" if the data in the table is compressed, or
              "False" if not.

            * **NumberOfBuckets** *(integer) --*

              Must be specified if the table contains any dimension
              columns.

            * **SerdeInfo** *(dict) --*

              The serialization/deserialization (SerDe) information.

              * **Name** *(string) --*

                Name of the SerDe.

              * **SerializationLibrary** *(string) --*

                Usually the class that implements the SerDe. An
                example is
                "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe".

              * **Parameters** *(dict) --*

                These key-value pairs define initialization parameters
                for the SerDe.

                * *(string) --*

                  * *(string) --*

            * **BucketColumns** *(list) --*

              A list of reducer grouping columns, clustering columns,
              and bucketing columns in the table.

              * *(string) --*

            * **SortColumns** *(list) --*

              A list specifying the sort order of each bucket in the
              table.

              * *(dict) --*

                Specifies the sort order of a sorted column.

                * **Column** *(string) --*

                  The name of the column.

                * **SortOrder** *(integer) --*

                  Indicates that the column is sorted in ascending
                  order ("== 1"), or in descending order ("== 0").

            * **Parameters** *(dict) --*

              The user-supplied properties in key-value form.

              * *(string) --*

                * *(string) --*

            * **SkewedInfo** *(dict) --*

              The information about values that appear frequently in a
              column (skewed values).

              * **SkewedColumnNames** *(list) --*

                A list of names of columns that contain skewed values.

                * *(string) --*

              * **SkewedColumnValues** *(list) --*

                A list of values that appear so frequently as to be
                considered skewed.

                * *(string) --*

              * **SkewedColumnValueLocationMaps** *(dict) --*

                A mapping of skewed values to the columns that contain
                them.

                * *(string) --*

                  * *(string) --*

            * **StoredAsSubDirectories** *(boolean) --*

              "True" if the table data is stored in subdirectories, or
              "False" if not.

            * **SchemaReference** *(dict) --*

              An object that references a schema stored in the Glue
              Schema Registry.

              When creating a table, you can pass an empty list of
              columns for the schema, and instead use a schema
              reference.

              * **SchemaId** *(dict) --*

                A structure that contains schema identity fields.
                Either this or the "SchemaVersionId" has to be
                provided.

                * **SchemaArn** *(string) --*

                  The Amazon Resource Name (ARN) of the schema. One of
                  "SchemaArn" or "SchemaName" has to be provided.

                * **SchemaName** *(string) --*

                  The name of the schema. One of "SchemaArn" or
                  "SchemaName" has to be provided.

                * **RegistryName** *(string) --*

                  The name of the schema registry that contains the
                  schema.

              * **SchemaVersionId** *(string) --*

                The unique ID assigned to a version of the schema.
                Either this or the "SchemaId" has to be provided.

              * **SchemaVersionNumber** *(integer) --*

                The version number of the schema.

          * **PartitionKeys** *(list) --*

            A list of columns by which the table is partitioned. Only
            primitive types are supported as partition keys.

            When you create a table used by Amazon Athena, and you do
            not specify any "PartitionKeys", you must at least set the
            value of "PartitionKeys" to an empty list. For example:

            ""PartitionKeys": []"

            * *(dict) --*

              A column in a "Table".

              * **Name** *(string) --*

                The name of the "Column".

              * **Type** *(string) --*

                The data type of the "Column".

              * **Comment** *(string) --*

                A free-form text comment.

              * **Parameters** *(dict) --*

                These key-value pairs define properties associated
                with the column.

                * *(string) --*

                  * *(string) --*

          * **ViewOriginalText** *(string) --*

            Included for Apache Hive compatibility. Not used in the
            normal course of Glue operations. If the table is a
            "VIRTUAL_VIEW", this field contains certain Athena
            configuration encoded in base64.

          * **ViewExpandedText** *(string) --*

            Included for Apache Hive compatibility. Not used in the
            normal course of Glue operations.

          * **TableType** *(string) --*

            The type of this table. Glue will create tables with the
            "EXTERNAL_TABLE" type. Other services, such as Athena, may
            create tables with additional table types.

            Glue related table types:

            EXTERNAL_TABLE
               Hive compatible attribute - indicates a non-Hive managed
               table.

            GOVERNED
               Used by Lake Formation. The Glue Data Catalog understands
               "GOVERNED".

          * **Parameters** *(dict) --*

            These key-value pairs define properties associated with
            the table.

            * *(string) --*

              * *(string) --*

          * **CreatedBy** *(string) --*

            The person or entity who created the table.

          * **IsRegisteredWithLakeFormation** *(boolean) --*

            Indicates whether the table has been registered with Lake
            Formation.

          * **TargetTable** *(dict) --*

            A "TableIdentifier" structure that describes a target
            table for resource linking.

            * **CatalogId** *(string) --*

              The ID of the Data Catalog in which the table resides.

            * **DatabaseName** *(string) --*

              The name of the catalog database that contains the
              target table.

            * **Name** *(string) --*

              The name of the target table.

            * **Region** *(string) --*

              Region of the target table.

          * **CatalogId** *(string) --*

            The ID of the Data Catalog in which the table resides.

          * **VersionId** *(string) --*

            The ID of the table version.

          * **FederatedTable** *(dict) --*

            A "FederatedTable" structure that references an entity
            outside the Glue Data Catalog.

            * **Identifier** *(string) --*

              A unique identifier for the federated table.

            * **DatabaseIdentifier** *(string) --*

              A unique identifier for the federated database.

            * **ConnectionName** *(string) --*

              The name of the connection to the external metastore.

            * **ConnectionType** *(string) --*

              The type of connection used to access the federated
              table, specifying the protocol or method for connecting
              to the external data source.

          * **ViewDefinition** *(dict) --*

            A structure that contains all the information that defines
            the view, including the dialect or dialects for the view,
            and the query.

            * **IsProtected** *(boolean) --*

              You can set this flag as true to instruct the engine not
              to push user-provided operations into the logical plan
              of the view during query planning. However, setting this
              flag does not guarantee that the engine will comply.
              Refer to the engine's documentation to understand the
              guarantees provided, if any.

            * **Definer** *(string) --*

              The definer of a view in SQL.

            * **SubObjects** *(list) --*

              A list of table Amazon Resource Names (ARNs).

              * *(string) --*

            * **Representations** *(list) --*

              A list of representations.

              * *(dict) --*

                A structure that contains the dialect of the view, and
                the query that defines the view.

                * **Dialect** *(string) --*

                  The dialect of the query engine.

                * **DialectVersion** *(string) --*

                  The version of the dialect of the query engine. For
                  example, 3.0.0.

                * **ViewOriginalText** *(string) --*

                  The "SELECT" query provided by the customer during
                  "CREATE VIEW DDL". This SQL is not used during a
                  query on a view ( "ViewExpandedText" is used
                  instead). "ViewOriginalText" is used for cases like
                  "SHOW CREATE VIEW" where users want to see the
                  original DDL command that created the view.

                * **ViewExpandedText** *(string) --*

                  The expanded SQL for the view. This SQL is used by
                  engines while processing a query on a view. Engines
                  may perform operations during view creation to
                  transform "ViewOriginalText" to "ViewExpandedText".
                  For example:

                  * Fully qualified identifiers: "SELECT * from table1
                    -> SELECT * from db1.table1"

                * **ValidationConnection** *(string) --*

                  The name of the connection to be used to validate
                  the specific representation of the view.

                * **IsStale** *(boolean) --*

                  Dialects marked as stale are no longer valid and
                  must be updated before they can be queried in their
                  respective query engines.

          * **IsMultiDialectView** *(boolean) --*

            Specifies whether the view supports the SQL dialects of
            one or more different query engines and can therefore be
            read by those engines.

          * **Status** *(dict) --*

            A structure containing information about the state of an
            asynchronous change to a table.

            * **RequestedBy** *(string) --*

              The ARN of the user who requested the asynchronous
              change.

            * **UpdatedBy** *(string) --*

              The ARN of the user to last manually alter the
              asynchronous change (requesting cancellation, etc).

            * **RequestTime** *(datetime) --*

              An ISO 8601 formatted date string indicating the time
              that the change was initiated.

            * **UpdateTime** *(datetime) --*

              An ISO 8601 formatted date string indicating the time
              that the state was last updated.

            * **Action** *(string) --*

              Indicates which action was called on the table,
              currently only "CREATE" or "UPDATE".

            * **State** *(string) --*

              A generic status for the change in progress, such as
              QUEUED, IN_PROGRESS, SUCCESS, or FAILED.

            * **Error** *(dict) --*

              An error that will only appear when the state is
              "FAILED". This is a parent-level exception message;
              there may be different "Error" objects for each dialect.

              * **ErrorCode** *(string) --*

                The code associated with this error.

              * **ErrorMessage** *(string) --*

                A message describing the error.

            * **Details** *(dict) --*

              A "StatusDetails" object with information about the
              requested change.

              * **RequestedChange** *(dict) --*

                A "Table" object representing the requested changes.

              * **ViewValidations** *(list) --*

                A list of "ViewValidation" objects that contain
                information for an analytical engine to validate a
                view.

                * *(dict) --*

                  A structure that contains information for an
                  analytical engine to validate a view, prior to
                  persisting the view metadata. Used in the case of
                  direct "UpdateTable" or "CreateTable" API calls.

                  * **Dialect** *(string) --*

                    The dialect of the query engine.

                  * **DialectVersion** *(string) --*

                    The version of the dialect of the query engine.
                    For example, 3.0.0.

                  * **ViewValidationText** *(string) --*

                    The "SELECT" query that defines the view, as
                    provided by the customer.

                  * **UpdateTime** *(datetime) --*

                    The time of the last update.

                  * **State** *(string) --*

                    The state of the validation.

                  * **Error** *(dict) --*

                    An error associated with the validation.

                    * **ErrorCode** *(string) --*

                      The code associated with this error.

                    * **ErrorMessage** *(string) --*

                      A message describing the error.
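
   The "Status" structure described above carries one parent-level
   "Error" plus a possible "Error" per dialect under
   "ViewValidations". A minimal sketch of collecting both into one
   list follows. The function name "summarize_status" and the sample
   data are made up for illustration, and the code assumes that any of
   these keys may be absent from a given response:

```python
def summarize_status(table):
    """Return (state, errors) for the Status block of a Table dict.

    `table` is the dict found under response['Table']. `errors` is a list
    of (scope, code, message) tuples, where scope is 'table' for the
    parent-level error or the dialect name for a per-dialect error.
    Returns (None, []) when the table has no Status block.
    """
    status = table.get("Status")
    if not status:
        return None, []

    errors = []
    parent = status.get("Error")
    if parent:
        errors.append(
            ("table", parent.get("ErrorCode"), parent.get("ErrorMessage")))

    details = status.get("Details") or {}
    for validation in details.get("ViewValidations", []):
        err = validation.get("Error")
        if err:
            errors.append((validation.get("Dialect"),
                           err.get("ErrorCode"), err.get("ErrorMessage")))
    return status.get("State"), errors


# Made-up sample shaped like the Response Syntax above.
sample_table = {
    "Name": "orders_view",
    "Status": {
        "Action": "UPDATE",
        "State": "FAILED",
        "Error": {"ErrorCode": "ExampleCode",
                  "ErrorMessage": "example parent error"},
        "Details": {
            "ViewValidations": [
                {"Dialect": "ATHENA", "State": "SUCCESS"},
                {"Dialect": "SPARK", "State": "FAILED",
                 "Error": {"ErrorCode": "ExampleCode",
                           "ErrorMessage": "example dialect error"}},
            ]
        },
    },
}
state, errors = summarize_status(sample_table)
```

   With a live client, the input would be
   "client.get_table(..., IncludeStatusDetails=True)['Table']".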

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"

   * "Glue.Client.exceptions.ResourceNotReadyException"

   * "Glue.Client.exceptions.FederationSourceException"

   * "Glue.Client.exceptions.FederationSourceRetryableException"
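
   A minimal sketch of calling "get_table" and flattening the result
   into a column list, treating "EntityNotFoundException" as "no such
   table". The function name "list_table_columns" is hypothetical, and
   the code assumes "StorageDescriptor" and "PartitionKeys" may be
   missing from the response:

```python
def list_table_columns(client, database_name, table_name):
    """Return [(name, type, is_partition_key), ...] for a catalog table.

    `client` is a Glue client such as boto3.client('glue').
    Returns None when the service reports that the table does not exist.
    """
    try:
        response = client.get_table(
            DatabaseName=database_name, Name=table_name)
    except client.exceptions.EntityNotFoundException:
        return None

    table = response["Table"]
    data_columns = table.get("StorageDescriptor", {}).get("Columns", [])
    partition_keys = table.get("PartitionKeys", [])

    # Data columns first, then partition keys, flagged so callers can
    # tell the two groups apart.
    return ([(c["Name"], c["Type"], False) for c in data_columns]
            + [(c["Name"], c["Type"], True) for c in partition_keys])


# Usage (placeholder names, requires AWS credentials):
#   import boto3
#   client = boto3.client('glue')
#   columns = list_table_columns(client, 'sales_db', 'orders')
```

   Other exceptions in the list above are left to propagate so that
   the caller can decide how to handle them.
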


list_data_quality_statistics
****************************

Glue.Client.list_data_quality_statistics(**kwargs)

   Retrieves a list of data quality statistics.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.list_data_quality_statistics(
          StatisticId='string',
          ProfileId='string',
          TimestampFilter={
              'RecordedBefore': datetime(2015, 1, 1),
              'RecordedAfter': datetime(2015, 1, 1)
          },
          MaxResults=123,
          NextToken='string'
      )

   Parameters:
      * **StatisticId** (*string*) -- The Statistic ID.

      * **ProfileId** (*string*) -- The Profile ID.

      * **TimestampFilter** (*dict*) --

        A timestamp filter.

        * **RecordedBefore** *(datetime) --*

          The timestamp before which statistics should be included in
          the results.

        * **RecordedAfter** *(datetime) --*

          The timestamp after which statistics should be included in
          the results.

      * **MaxResults** (*integer*) -- The maximum number of results to
        return in this request.

      * **NextToken** (*string*) -- A pagination token to request the
        next page of results.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Statistics': [
                 {
                     'StatisticId': 'string',
                     'ProfileId': 'string',
                     'RunIdentifier': {
                         'RunId': 'string',
                         'JobRunId': 'string'
                     },
                     'StatisticName': 'string',
                     'DoubleValue': 123.0,
                     'EvaluationLevel': 'Dataset'|'Column'|'Multicolumn',
                     'ColumnsReferenced': [
                         'string',
                     ],
                     'ReferencedDatasets': [
                         'string',
                     ],
                     'StatisticProperties': {
                         'string': 'string'
                     },
                     'RecordedOn': datetime(2015, 1, 1),
                     'InclusionAnnotation': {
                         'Value': 'INCLUDE'|'EXCLUDE',
                         'LastModifiedOn': datetime(2015, 1, 1)
                     }
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Statistics** *(list) --*

          A "StatisticSummaryList".

          * *(dict) --*

            Summary information about a statistic.

            * **StatisticId** *(string) --*

              The Statistic ID.

            * **ProfileId** *(string) --*

              The Profile ID.

            * **RunIdentifier** *(dict) --*

              The Run Identifier.

              * **RunId** *(string) --*

                The Run ID.

              * **JobRunId** *(string) --*

                The Job Run ID.

            * **StatisticName** *(string) --*

              The name of the statistic.

            * **DoubleValue** *(float) --*

              The value of the statistic.

            * **EvaluationLevel** *(string) --*

              The evaluation level of the statistic. Possible values:
              "Dataset", "Column", "Multicolumn".

            * **ColumnsReferenced** *(list) --*

              The list of columns referenced by the statistic.

              * *(string) --*

            * **ReferencedDatasets** *(list) --*

              The list of datasets referenced by the statistic.

              * *(string) --*

            * **StatisticProperties** *(dict) --*

              A "StatisticPropertiesMap", which contains a
              "NameString" and "DescriptionString".

              * *(string) --*

                * *(string) --*

            * **RecordedOn** *(datetime) --*

              The timestamp when the statistic was recorded.

            * **InclusionAnnotation** *(dict) --*

              The inclusion annotation for the statistic.

              * **Value** *(string) --*

                The inclusion annotation value.

              * **LastModifiedOn** *(datetime) --*

                The timestamp when the inclusion annotation was last
                modified.

        * **NextToken** *(string) --*

          A pagination token to request the next page of results.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"
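
   A minimal sketch of reading every page by passing each returned
   "NextToken" back into the next request. The generator name
   "iter_data_quality_statistics" is hypothetical, and the code
   assumes the last page omits "NextToken" or returns it empty:

```python
def iter_data_quality_statistics(client, **request):
    """Yield every statistic summary, following NextToken across pages.

    `client` is a Glue client such as boto3.client('glue'). `request`
    holds the other call arguments, for example ProfileId or MaxResults.
    """
    params = dict(request)
    while True:
        page = client.list_data_quality_statistics(**params)
        yield from page.get("Statistics", [])

        token = page.get("NextToken")
        if not token:
            break  # no further pages
        params["NextToken"] = token


# Usage (requires AWS credentials):
#   import boto3
#   client = boto3.client('glue')
#   for stat in iter_data_quality_statistics(client, MaxResults=100):
#       print(stat['StatisticName'], stat.get('DoubleValue'))
```
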


create_session
**************

Glue.Client.create_session(**kwargs)

   Creates a new session.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_session(
          Id='string',
          Description='string',
          Role='string',
          Command={
              'Name': 'string',
              'PythonVersion': 'string'
          },
          Timeout=123,
          IdleTimeout=123,
          DefaultArguments={
              'string': 'string'
          },
          Connections={
              'Connections': [
                  'string',
              ]
          },
          MaxCapacity=123.0,
          NumberOfWorkers=123,
          WorkerType='Standard'|'G.1X'|'G.2X'|'G.025X'|'G.4X'|'G.8X'|'Z.2X',
          SecurityConfiguration='string',
          GlueVersion='string',
          Tags={
              'string': 'string'
          },
          RequestOrigin='string'
      )
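
   A minimal sketch of assembling the three required arguments
   together with a worker configuration. The per-worker sizes are the
   "G.1X" to "G.8X" figures documented under "WorkerType" for this
   method. The names "WORKER_SPECS", "build_session_request" and
   "total_dpus", the "PythonVersion" value, and the role ARN are
   illustrative assumptions:

```python
# Per-worker resources as documented for the WorkerType parameter.
WORKER_SPECS = {
    "G.1X": {"dpu": 1, "vcpus": 4, "memory_gb": 16, "disk_gb": 94},
    "G.2X": {"dpu": 2, "vcpus": 8, "memory_gb": 32, "disk_gb": 138},
    "G.4X": {"dpu": 4, "vcpus": 16, "memory_gb": 64, "disk_gb": 256},
    "G.8X": {"dpu": 8, "vcpus": 32, "memory_gb": 128, "disk_gb": 512},
}


def total_dpus(worker_type, number_of_workers):
    """Total DPUs for `number_of_workers` workers of `worker_type`."""
    return WORKER_SPECS[worker_type]["dpu"] * number_of_workers


def build_session_request(session_id, role_arn, worker_type,
                          number_of_workers):
    """Assemble keyword arguments for client.create_session().

    Hypothetical helper covering only the Spark worker types listed in
    WORKER_SPECS. Id, Role and Command are the required arguments.
    """
    if worker_type not in WORKER_SPECS:
        raise ValueError("unsupported worker type: %r" % (worker_type,))
    if number_of_workers < 1:
        raise ValueError("number_of_workers must be at least 1")
    return {
        "Id": session_id,
        "Role": role_arn,
        # 'glueetl' is one of the two documented command names; the
        # PythonVersion value is an example.
        "Command": {"Name": "glueetl", "PythonVersion": "3"},
        "WorkerType": worker_type,
        "NumberOfWorkers": number_of_workers,
    }


# Placeholder session ID and role ARN.
request = build_session_request(
    "example-session",
    "arn:aws:iam::123456789012:role/ExampleGlueRole",
    "G.2X", 5,
)
# response = client.create_session(**request)
```

   Here "total_dpus('G.2X', 5)" gives 10, which is one way to estimate
   the capacity a session will request before creating it.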

   Parameters:
      * **Id** (*string*) --

        **[REQUIRED]**

        The ID of the session request.

      * **Description** (*string*) -- The description of the session.

      * **Role** (*string*) --

        **[REQUIRED]**

        The IAM Role ARN.

      * **Command** (*dict*) --

        **[REQUIRED]**

        The "SessionCommand" that runs the job.

        * **Name** *(string) --*

          Specifies the name of the SessionCommand. Can be 'glueetl'
          or 'gluestreaming'.

        * **PythonVersion** *(string) --*

          Specifies the Python version. The Python version indicates
          the version supported for jobs of type Spark.

      * **Timeout** (*integer*) -- The number of minutes before the
        session times out. Default for Spark ETL jobs is 48 hours
        (2880 minutes). Consult the documentation for other job types.

      * **IdleTimeout** (*integer*) -- The number of minutes the
        session can be idle before it times out. Default for Spark ETL
        jobs is the value of "Timeout". Consult the documentation for
        other job types.

      * **DefaultArguments** (*dict*) --

        A map of key-value pairs. The maximum is 75 pairs.

        * *(string) --*

          * *(string) --*

      * **Connections** (*dict*) --

        The connections to use for the session.

        * **Connections** *(list) --*

          A list of connections used by the job.

          * *(string) --*

      * **MaxCapacity** (*float*) -- The number of Glue data
        processing units (DPUs) that can be allocated when the job
        runs. A DPU is a relative measure of processing power that
        consists of 4 vCPUs of compute capacity and 16 GB memory.

      * **NumberOfWorkers** (*integer*) -- The number of workers of a
        defined "WorkerType" to use for the session.

      * **WorkerType** (*string*) --

        The type of predefined worker that is allocated when a job
        runs. Accepts a value of G.1X, G.2X, G.4X, or G.8X for Spark
        jobs. Accepts the value Z.2X for Ray notebooks.

        * For the "G.1X" worker type, each worker maps to 1 DPU (4
          vCPUs, 16 GB of memory) with 94GB disk, and provides 1
          executor per worker. We recommend this worker type for
          workloads such as data transforms, joins, and queries, to
          offers a scalable and cost effective way to run most jobs.

        * For the "G.2X" worker type, each worker maps to 2 DPU (8
          vCPUs, 32 GB of memory) with 138GB disk, and provides 1
          executor per worker. We recommend this worker type for
          workloads such as data transforms, joins, and queries,
          because it offers a scalable and cost-effective way to run
          most jobs.

        * For the "G.4X" worker type, each worker maps to 4 DPU (16
          vCPUs, 64 GB of memory) with 256GB disk, and provides 1
          executor per worker. We recommend this worker type for jobs
          whose workloads contain your most demanding transforms,
          aggregations, joins, and queries. This worker type is
          available only for Glue version 3.0 or later Spark ETL jobs
          in the following Amazon Web Services Regions: US East
          (Ohio), US East (N. Virginia), US West (Oregon), Asia
          Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific
          (Tokyo), Canada (Central), Europe (Frankfurt), Europe
          (Ireland), and Europe (Stockholm).

        * For the "G.8X" worker type, each worker maps to 8 DPU (32
          vCPUs, 128 GB of memory) with 512GB disk, and provides 1
          executor per worker. We recommend this worker type for jobs
          whose workloads contain your most demanding transforms,
          aggregations, joins, and queries. This worker type is
          available only for Glue version 3.0 or later Spark ETL jobs,
          in the same Amazon Web Services Regions as supported for the
          "G.4X" worker type.

        * For the "Z.2X" worker type, each worker maps to 2 M-DPU (8
          vCPUs, 64 GB of memory) with 128 GB disk, and provides up to
          8 Ray workers based on the autoscaler.

      * **SecurityConfiguration** (*string*) -- The name of the
        SecurityConfiguration structure to be used with the session.

      * **GlueVersion** (*string*) -- The Glue version determines the
        versions of Apache Spark and Python that Glue supports. The
        GlueVersion must be greater than 2.0.

      * **Tags** (*dict*) --

        The map of key value pairs (tags) belonging to the session.

        * *(string) --*

          * *(string) --*

      * **RequestOrigin** (*string*) -- The origin of the request.
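
   The worker sizing described under "WorkerType" reduces to simple
   arithmetic. The following is a minimal sketch, not part of the Glue
   API: "WORKER_SPECS" and "session_capacity" are hypothetical names,
   and the figures are copied from the worker type descriptions above.
   "Z.2X" is left out because it is sized in M-DPUs rather than DPUs.

```python
# Per-worker resources, copied from the WorkerType descriptions above.
# WORKER_SPECS and session_capacity are illustrative names, not Glue APIs.
WORKER_SPECS = {
    "G.1X": {"dpu": 1, "vcpus": 4, "memory_gb": 16},
    "G.2X": {"dpu": 2, "vcpus": 8, "memory_gb": 32},
    "G.4X": {"dpu": 4, "vcpus": 16, "memory_gb": 64},
    "G.8X": {"dpu": 8, "vcpus": 32, "memory_gb": 128},
}


def session_capacity(worker_type, number_of_workers):
    """Return total DPUs, vCPUs and memory for a worker configuration."""
    if number_of_workers < 1:
        raise ValueError("number_of_workers must be at least 1")
    spec = WORKER_SPECS[worker_type]  # raises KeyError for unknown types
    return {key: value * number_of_workers for key, value in spec.items()}


# Five G.2X workers: 5 * 2 DPUs, 5 * 8 vCPUs, 5 * 32 GB of memory.
print(session_capacity("G.2X", 5))
```

   A helper like this may be useful for checking a planned
   "NumberOfWorkers" and "WorkerType" combination before sending the
   request.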

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Session': {
                 'Id': 'string',
                 'CreatedOn': datetime(2015, 1, 1),
                 'Status': 'PROVISIONING'|'READY'|'FAILED'|'TIMEOUT'|'STOPPING'|'STOPPED',
                 'ErrorMessage': 'string',
                 'Description': 'string',
                 'Role': 'string',
                 'Command': {
                     'Name': 'string',
                     'PythonVersion': 'string'
                 },
                 'DefaultArguments': {
                     'string': 'string'
                 },
                 'Connections': {
                     'Connections': [
                         'string',
                     ]
                 },
                 'Progress': 123.0,
                 'MaxCapacity': 123.0,
                 'SecurityConfiguration': 'string',
                 'GlueVersion': 'string',
                 'NumberOfWorkers': 123,
                 'WorkerType': 'Standard'|'G.1X'|'G.2X'|'G.025X'|'G.4X'|'G.8X'|'Z.2X',
                 'CompletedOn': datetime(2015, 1, 1),
                 'ExecutionTime': 123.0,
                 'DPUSeconds': 123.0,
                 'IdleTimeout': 123,
                 'ProfileName': 'string'
             }
         }

      **Response Structure**

      * *(dict) --*

        * **Session** *(dict) --*

          Returns the session object in the response.

          * **Id** *(string) --*

            The ID of the session.

          * **CreatedOn** *(datetime) --*

            The time and date when the session was created.

          * **Status** *(string) --*

            The session status.

          * **ErrorMessage** *(string) --*

            The error message displayed during the session.

          * **Description** *(string) --*

            The description of the session.

          * **Role** *(string) --*

            The name or Amazon Resource Name (ARN) of the IAM role
            associated with the Session.

          * **Command** *(dict) --*

            The command object. See SessionCommand.

            * **Name** *(string) --*

              Specifies the name of the SessionCommand. Can be
              'glueetl' or 'gluestreaming'.

            * **PythonVersion** *(string) --*

              Specifies the Python version. The Python version
              indicates the version supported for jobs of type Spark.

          * **DefaultArguments** *(dict) --*

            A map array of key-value pairs. Max is 75 pairs.

            * *(string) --*

              * *(string) --*

          * **Connections** *(dict) --*

            The number of connections used for the session.

            * **Connections** *(list) --*

              A list of connections used by the job.

              * *(string) --*

          * **Progress** *(float) --*

            The code execution progress of the session.

          * **MaxCapacity** *(float) --*

            The number of Glue data processing units (DPUs) that can
            be allocated when the job runs. A DPU is a relative
            measure of processing power that consists of 4 vCPUs of
            compute capacity and 16 GB memory.

          * **SecurityConfiguration** *(string) --*

            The name of the SecurityConfiguration structure to be used
            with the session.

          * **GlueVersion** *(string) --*

            The Glue version determines the versions of Apache Spark
            and Python that Glue supports. The GlueVersion must be
            greater than 2.0.

          * **NumberOfWorkers** *(integer) --*

            The number of workers of a defined "WorkerType" to use for
            the session.

          * **WorkerType** *(string) --*

            The type of predefined worker that is allocated when a
            session runs. Accepts a value of "G.1X", "G.2X", "G.4X",
            or "G.8X" for Spark sessions. Accepts the value "Z.2X" for
            Ray sessions.

          * **CompletedOn** *(datetime) --*

            The date and time that this session is completed.

          * **ExecutionTime** *(float) --*

            The total time the session ran for.

          * **DPUSeconds** *(float) --*

            The DPUs consumed by the session (formula: ExecutionTime *
            MaxCapacity).

          * **IdleTimeout** *(integer) --*

            The number of minutes when idle before the session times
            out.

          * **ProfileName** *(string) --*

            The name of a Glue usage profile associated with the
            session.
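
   The "DPUSeconds" formula given above can be checked against a
   returned "Session" dict. This is a minimal sketch with made-up
   values; "dpu_seconds" is a hypothetical helper, and it assumes
   "ExecutionTime" is expressed in seconds, which the field name
   "DPUSeconds" suggests but the description does not state.

```python
def dpu_seconds(session):
    """Apply the documented formula: ExecutionTime * MaxCapacity."""
    return session["ExecutionTime"] * session["MaxCapacity"]


# Hypothetical values shaped like the 'Session' dict in the response.
example_session = {"ExecutionTime": 120.0, "MaxCapacity": 5.0}

print(dpu_seconds(example_session))  # 120.0 * 5.0 = 600.0
```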

   **Exceptions**

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.IdempotentParameterMismatchException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.ValidationException"

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"


search_tables
*************

Glue.Client.search_tables(**kwargs)

   Searches a set of tables based on properties in the table metadata
   as well as on the parent database. You can search against text or
   filter conditions.

   You can only get tables that you have access to based on the
   security policies defined in Lake Formation. You need at least
   read-only access to the table for it to be returned. If you do not
   have access to all the columns in the table, these columns will not
   be searched against when returning the list of tables back to you.
   If you have access to the columns but not the data in the columns,
   those columns and the associated metadata for those columns will be
   included in the search.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.search_tables(
          CatalogId='string',
          NextToken='string',
          Filters=[
              {
                  'Key': 'string',
                  'Value': 'string',
                  'Comparator': 'EQUALS'|'GREATER_THAN'|'LESS_THAN'|'GREATER_THAN_EQUALS'|'LESS_THAN_EQUALS'
              },
          ],
          SearchText='string',
          SortCriteria=[
              {
                  'FieldName': 'string',
                  'Sort': 'ASC'|'DESC'
              },
          ],
          MaxResults=123,
          ResourceShareType='FOREIGN'|'ALL'|'FEDERATED',
          IncludeStatusDetails=True|False
      )
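
   Because a single call returns at most "MaxResults" tables, results
   are usually collected by passing the returned "NextToken" back into
   the next call. The following is a minimal sketch of that loop;
   "search_all_tables" is a hypothetical helper, and "client" is
   assumed to be any object exposing "search_tables" with the request
   and response shapes documented here, such as a boto3 Glue client.

```python
def search_all_tables(client, **search_kwargs):
    """Yield every table from search_tables, following NextToken."""
    kwargs = dict(search_kwargs)  # copy so the caller's dict is untouched
    while True:
        response = client.search_tables(**kwargs)
        # TableList may be absent or empty on a page with no matches.
        yield from response.get("TableList", [])
        token = response.get("NextToken")
        if not token:
            break  # no continuation token: this was the last page
        kwargs["NextToken"] = token
```

   With a real client this could be called as
   "search_all_tables(boto3.client('glue'), SearchText='sales')",
   where the search text is a placeholder.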

   Parameters:
      * **CatalogId** (*string*) -- A unique identifier, consisting of
        "account_id".

      * **NextToken** (*string*) -- A continuation token, included if
        this is a continuation call.

      * **Filters** (*list*) --

        A list of key-value pairs, and a comparator used to filter the
        search results. Returns all entities matching the predicate.

        The "Comparator" member of the "PropertyPredicate" struct is
        used only for time fields, and can be omitted for other field
        types. Also, when comparing string values, such as when
        "Key=Name", a fuzzy match algorithm is used. The "Key" field
        (for example, the value of the "Name" field) is split on
        certain punctuation characters, for example, -, :, #, etc.
        into tokens. Then each token is exact-match compared with the
        "Value" member of "PropertyPredicate". For example, if
        "Key=Name" and "Value=link", tables named "customer-link" and
        "xx-link-yy" are returned, but "xxlinkyy" is not returned.

        * *(dict) --*

          Defines a property predicate.

          * **Key** *(string) --*

            The key of the property.

          * **Value** *(string) --*

            The value of the property.

          * **Comparator** *(string) --*

            The comparator used to compare this property to others.

      * **SearchText** (*string*) --

        A string used for a text search.

        Specifying a value in quotes filters based on an exact match
        to the value.

      * **SortCriteria** (*list*) --

        A list of criteria for sorting the results by a field name, in
        an ascending or descending order.

        * *(dict) --*

          Specifies a field to sort by and a sort order.

          * **FieldName** *(string) --*

            The name of the field on which to sort.

          * **Sort** *(string) --*

            An ascending or descending sort.

      * **MaxResults** (*integer*) -- The maximum number of tables to
        return in a single response.

      * **ResourceShareType** (*string*) --

        Allows you to specify that you want to search the tables
        shared with your account. The allowable values are "FOREIGN"
        or "ALL".

        * If set to "FOREIGN", will search the tables shared with your
          account.

        * If set to "ALL", will search the tables shared with your
          account, as well as the tables in your local account.

      * **IncludeStatusDetails** (*boolean*) -- Specifies whether to
        include status details related to a request to create or
        update a Glue Data Catalog view.
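
   The token matching described under "Filters" can be simulated
   locally to predict which table names a string filter would match.
   This is a minimal sketch, not the service's implementation:
   "name_matches" is a hypothetical helper, the delimiter set is
   limited to the three characters the description names (it says
   "etc.", so the service may split on more), and the comparison is
   assumed to be case-sensitive.

```python
import re

# Assumed delimiter set: only "-", ":" and "#" are named in the text above.
_DELIMITERS = re.compile(r"[-:#]")


def name_matches(name, value):
    """Return True if any token of `name` equals `value` exactly."""
    tokens = _DELIMITERS.split(name)
    return value in tokens


# A filter of this shape would be passed as the Filters parameter.
filters = [{"Key": "Name", "Value": "link"}]

for table_name in ("customer-link", "xx-link-yy", "xxlinkyy"):
    # The first two names contain the token "link"; the third does not.
    print(table_name, name_matches(table_name, filters[0]["Value"]))
```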

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'NextToken': 'string',
             'TableList': [
                 {
                     'Name': 'string',
                     'DatabaseName': 'string',
                     'Description': 'string',
                     'Owner': 'string',
                     'CreateTime': datetime(2015, 1, 1),
                     'UpdateTime': datetime(2015, 1, 1),
                     'LastAccessTime': datetime(2015, 1, 1),
                     'LastAnalyzedTime': datetime(2015, 1, 1),
                     'Retention': 123,
                     'StorageDescriptor': {
                         'Columns': [
                             {
                                 'Name': 'string',
                                 'Type': 'string',
                                 'Comment': 'string',
                                 'Parameters': {
                                     'string': 'string'
                                 }
                             },
                         ],
                         'Location': 'string',
                         'AdditionalLocations': [
                             'string',
                         ],
                         'InputFormat': 'string',
                         'OutputFormat': 'string',
                         'Compressed': True|False,
                         'NumberOfBuckets': 123,
                         'SerdeInfo': {
                             'Name': 'string',
                             'SerializationLibrary': 'string',
                             'Parameters': {
                                 'string': 'string'
                             }
                         },
                         'BucketColumns': [
                             'string',
                         ],
                         'SortColumns': [
                             {
                                 'Column': 'string',
                                 'SortOrder': 123
                             },
                         ],
                         'Parameters': {
                             'string': 'string'
                         },
                         'SkewedInfo': {
                             'SkewedColumnNames': [
                                 'string',
                             ],
                             'SkewedColumnValues': [
                                 'string',
                             ],
                             'SkewedColumnValueLocationMaps': {
                                 'string': 'string'
                             }
                         },
                         'StoredAsSubDirectories': True|False,
                         'SchemaReference': {
                             'SchemaId': {
                                 'SchemaArn': 'string',
                                 'SchemaName': 'string',
                                 'RegistryName': 'string'
                             },
                             'SchemaVersionId': 'string',
                             'SchemaVersionNumber': 123
                         }
                     },
                     'PartitionKeys': [
                         {
                             'Name': 'string',
                             'Type': 'string',
                             'Comment': 'string',
                             'Parameters': {
                                 'string': 'string'
                             }
                         },
                     ],
                     'ViewOriginalText': 'string',
                     'ViewExpandedText': 'string',
                     'TableType': 'string',
                     'Parameters': {
                         'string': 'string'
                     },
                     'CreatedBy': 'string',
                     'IsRegisteredWithLakeFormation': True|False,
                     'TargetTable': {
                         'CatalogId': 'string',
                         'DatabaseName': 'string',
                         'Name': 'string',
                         'Region': 'string'
                     },
                     'CatalogId': 'string',
                     'VersionId': 'string',
                     'FederatedTable': {
                         'Identifier': 'string',
                         'DatabaseIdentifier': 'string',
                         'ConnectionName': 'string',
                         'ConnectionType': 'string'
                     },
                     'ViewDefinition': {
                         'IsProtected': True|False,
                         'Definer': 'string',
                         'SubObjects': [
                             'string',
                         ],
                         'Representations': [
                             {
                                 'Dialect': 'REDSHIFT'|'ATHENA'|'SPARK',
                                 'DialectVersion': 'string',
                                 'ViewOriginalText': 'string',
                                 'ViewExpandedText': 'string',
                                 'ValidationConnection': 'string',
                                 'IsStale': True|False
                             },
                         ]
                     },
                     'IsMultiDialectView': True|False,
                     'Status': {
                         'RequestedBy': 'string',
                         'UpdatedBy': 'string',
                         'RequestTime': datetime(2015, 1, 1),
                         'UpdateTime': datetime(2015, 1, 1),
                         'Action': 'UPDATE'|'CREATE',
                         'State': 'QUEUED'|'IN_PROGRESS'|'SUCCESS'|'STOPPED'|'FAILED',
                         'Error': {
                             'ErrorCode': 'string',
                             'ErrorMessage': 'string'
                         },
                         'Details': {
                             'RequestedChange': {'... recursive ...'},
                             'ViewValidations': [
                                 {
                                     'Dialect': 'REDSHIFT'|'ATHENA'|'SPARK',
                                     'DialectVersion': 'string',
                                     'ViewValidationText': 'string',
                                     'UpdateTime': datetime(2015, 1, 1),
                                     'State': 'QUEUED'|'IN_PROGRESS'|'SUCCESS'|'STOPPED'|'FAILED',
                                     'Error': {
                                         'ErrorCode': 'string',
                                         'ErrorMessage': 'string'
                                     }
                                 },
                             ]
                         }
                     }
                 },
             ]
         }

      **Response Structure**

      * *(dict) --*

        * **NextToken** *(string) --*

          A continuation token, present if the current list segment is
          not the last.

        * **TableList** *(list) --*

          A list of the requested "Table" objects. The "SearchTables"
          response returns only the tables that you have access to.

          * *(dict) --*

            Represents a collection of related data organized in
            columns and rows.

            * **Name** *(string) --*

              The table name. For Hive compatibility, this must be
              entirely lowercase.

            * **DatabaseName** *(string) --*

              The name of the database where the table metadata
              resides. For Hive compatibility, this must be all
              lowercase.

            * **Description** *(string) --*

              A description of the table.

            * **Owner** *(string) --*

              The owner of the table.

            * **CreateTime** *(datetime) --*

              The time when the table definition was created in the
              Data Catalog.

            * **UpdateTime** *(datetime) --*

              The last time that the table was updated.

            * **LastAccessTime** *(datetime) --*

              The last time that the table was accessed. This is
              usually taken from HDFS, and might not be reliable.

            * **LastAnalyzedTime** *(datetime) --*

              The last time that column statistics were computed for
              this table.

            * **Retention** *(integer) --*

              The retention time for this table.

            * **StorageDescriptor** *(dict) --*

              A storage descriptor containing information about the
              physical storage of this table.

              * **Columns** *(list) --*

                A list of the "Columns" in the table.

                * *(dict) --*

                  A column in a "Table".

                  * **Name** *(string) --*

                    The name of the "Column".

                  * **Type** *(string) --*

                    The data type of the "Column".

                  * **Comment** *(string) --*

                    A free-form text comment.

                  * **Parameters** *(dict) --*

                    These key-value pairs define properties associated
                    with the column.

                    * *(string) --*

                      * *(string) --*

              * **Location** *(string) --*

                The physical location of the table. By default, this
                takes the form of the warehouse location, followed by
                the database location in the warehouse, followed by
                the table name.

              * **AdditionalLocations** *(list) --*

                A list of locations that point to the path where a
                Delta table is located.

                * *(string) --*

              * **InputFormat** *(string) --*

                The input format: "SequenceFileInputFormat" (binary),
                or "TextInputFormat", or a custom format.

              * **OutputFormat** *(string) --*

                The output format: "SequenceFileOutputFormat"
                (binary), or "IgnoreKeyTextOutputFormat", or a custom
                format.

              * **Compressed** *(boolean) --*

                "True" if the data in the table is compressed, or
                "False" if not.

              * **NumberOfBuckets** *(integer) --*

                Must be specified if the table contains any dimension
                columns.

              * **SerdeInfo** *(dict) --*

                The serialization/deserialization (SerDe) information.

                * **Name** *(string) --*

                  Name of the SerDe.

                * **SerializationLibrary** *(string) --*

                  Usually the class that implements the SerDe. An
                  example is "org.apache.hadoop.hive.serde2.columnar.
                  ColumnarSerDe".

                * **Parameters** *(dict) --*

                  These key-value pairs define initialization
                  parameters for the SerDe.

                  * *(string) --*

                    * *(string) --*

              * **BucketColumns** *(list) --*

                A list of reducer grouping columns, clustering
                columns, and bucketing columns in the table.

                * *(string) --*

              * **SortColumns** *(list) --*

                A list specifying the sort order of each bucket in the
                table.

                * *(dict) --*

                  Specifies the sort order of a sorted column.

                  * **Column** *(string) --*

                    The name of the column.

                  * **SortOrder** *(integer) --*

                    Indicates that the column is sorted in ascending
                    order ("== 1"), or in descending order ("== 0").

              * **Parameters** *(dict) --*

                The user-supplied properties in key-value form.

                * *(string) --*

                  * *(string) --*

              * **SkewedInfo** *(dict) --*

                The information about values that appear frequently in
                a column (skewed values).

                * **SkewedColumnNames** *(list) --*

                  A list of names of columns that contain skewed
                  values.

                  * *(string) --*

                * **SkewedColumnValues** *(list) --*

                  A list of values that appear so frequently as to be
                  considered skewed.

                  * *(string) --*

                * **SkewedColumnValueLocationMaps** *(dict) --*

                  A mapping of skewed values to the columns that
                  contain them.

                  * *(string) --*

                    * *(string) --*

              * **StoredAsSubDirectories** *(boolean) --*

                "True" if the table data is stored in subdirectories,
                or "False" if not.

              * **SchemaReference** *(dict) --*

                An object that references a schema stored in the Glue
                Schema Registry.

                When creating a table, you can pass an empty list of
                columns for the schema, and instead use a schema
                reference.

                * **SchemaId** *(dict) --*

                  A structure that contains schema identity fields.
                  Either this or the "SchemaVersionId" has to be
                  provided.

                  * **SchemaArn** *(string) --*

                    The Amazon Resource Name (ARN) of the schema. One
                    of "SchemaArn" or "SchemaName" has to be provided.

                  * **SchemaName** *(string) --*

                    The name of the schema. One of "SchemaArn" or
                    "SchemaName" has to be provided.

                  * **RegistryName** *(string) --*

                    The name of the schema registry that contains the
                    schema.

                * **SchemaVersionId** *(string) --*

                  The unique ID assigned to a version of the schema.
                  Either this or the "SchemaId" has to be provided.

                * **SchemaVersionNumber** *(integer) --*

                  The version number of the schema.

            * **PartitionKeys** *(list) --*

              A list of columns by which the table is partitioned.
              Only primitive types are supported as partition keys.

              When you create a table used by Amazon Athena, and you
              do not specify any "partitionKeys", you must at least
              set the value of "partitionKeys" to an empty list. For
              example:

              ""PartitionKeys": []"

              * *(dict) --*

                A column in a "Table".

                * **Name** *(string) --*

                  The name of the "Column".

                * **Type** *(string) --*

                  The data type of the "Column".

                * **Comment** *(string) --*

                  A free-form text comment.

                * **Parameters** *(dict) --*

                  These key-value pairs define properties associated
                  with the column.

                  * *(string) --*

                    * *(string) --*

            * **ViewOriginalText** *(string) --*

              Included for Apache Hive compatibility. Not used in the
              normal course of Glue operations. If the table is a
              "VIRTUAL_VIEW", this field contains certain Athena
              configuration encoded in base64.

            * **ViewExpandedText** *(string) --*

              Included for Apache Hive compatibility. Not used in the
              normal course of Glue operations.

            * **TableType** *(string) --*

              The type of this table. Glue will create tables with the
              "EXTERNAL_TABLE" type. Other services, such as Athena,
              may create tables with additional table types.

              Glue related table types:

                 EXTERNAL_TABLE

              Hive compatible attribute - indicates a non-Hive managed
              table.

                 GOVERNED

              Used by Lake Formation. The Glue Data Catalog
              understands "GOVERNED".

            * **Parameters** *(dict) --*

              These key-value pairs define properties associated with
              the table.

              * *(string) --*

                * *(string) --*

            * **CreatedBy** *(string) --*

              The person or entity who created the table.

            * **IsRegisteredWithLakeFormation** *(boolean) --*

              Indicates whether the table has been registered with
              Lake Formation.

            * **TargetTable** *(dict) --*

              A "TableIdentifier" structure that describes a target
              table for resource linking.

              * **CatalogId** *(string) --*

                The ID of the Data Catalog in which the table resides.

              * **DatabaseName** *(string) --*

                The name of the catalog database that contains the
                target table.

              * **Name** *(string) --*

                The name of the target table.

              * **Region** *(string) --*

                Region of the target table.

            * **CatalogId** *(string) --*

              The ID of the Data Catalog in which the table resides.

            * **VersionId** *(string) --*

              The ID of the table version.

            * **FederatedTable** *(dict) --*

              A "FederatedTable" structure that references an entity
              outside the Glue Data Catalog.

              * **Identifier** *(string) --*

                A unique identifier for the federated table.

              * **DatabaseIdentifier** *(string) --*

                A unique identifier for the federated database.

              * **ConnectionName** *(string) --*

                The name of the connection to the external metastore.

              * **ConnectionType** *(string) --*

                The type of connection used to access the federated
                table, specifying the protocol or method for
                connecting to the external data source.

            * **ViewDefinition** *(dict) --*

              A structure that contains all the information that
              defines the view, including the dialect or dialects for
              the view, and the query.

              * **IsProtected** *(boolean) --*

                You can set this flag as true to instruct the engine
                not to push user-provided operations into the logical
                plan of the view during query planning. However,
                setting this flag does not guarantee that the engine
                will comply. Refer to the engine's documentation to
                understand the guarantees provided, if any.

              * **Definer** *(string) --*

                The definer of a view in SQL.

              * **SubObjects** *(list) --*

                A list of table Amazon Resource Names (ARNs).

                * *(string) --*

              * **Representations** *(list) --*

                A list of representations.

                * *(dict) --*

                  A structure that contains the dialect of the view,
                  and the query that defines the view.

                  * **Dialect** *(string) --*

                    The dialect of the query engine.

                  * **DialectVersion** *(string) --*

                    The version of the dialect of the query engine.
                    For example, 3.0.0.

                  * **ViewOriginalText** *(string) --*

                    The "SELECT" query provided by the customer during
                    "CREATE VIEW DDL". This SQL is not used during a
                    query on a view ("ViewExpandedText" is used
                    instead). "ViewOriginalText" is used for cases
                    like "SHOW CREATE VIEW" where users want to see
                    the original DDL command that created the view.

                  * **ViewExpandedText** *(string) --*

                    The expanded SQL for the view. This SQL is used by
                    engines while processing a query on a view.
                    Engines may perform operations during view
                    creation to transform "ViewOriginalText" to
                    "ViewExpandedText". For example:

                    * Fully qualified identifiers: "SELECT * from
                      table1 -> SELECT * from db1.table1"

                  * **ValidationConnection** *(string) --*

                    The name of the connection to be used to validate
                    the specific representation of the view.

                  * **IsStale** *(boolean) --*

                    Dialects marked as stale are no longer valid and
                    must be updated before they can be queried in
                    their respective query engines.

            * **IsMultiDialectView** *(boolean) --*

              Specifies whether the view supports the SQL dialects of
              one or more different query engines and can therefore be
              read by those engines.

            * **Status** *(dict) --*

              A structure containing information about the state of an
              asynchronous change to a table.

              * **RequestedBy** *(string) --*

                The ARN of the user who requested the asynchronous
                change.

              * **UpdatedBy** *(string) --*

                The ARN of the user to last manually alter the
                asynchronous change (requesting cancellation, etc).

              * **RequestTime** *(datetime) --*

                An ISO 8601 formatted date string indicating the time
                that the change was initiated.

              * **UpdateTime** *(datetime) --*

                An ISO 8601 formatted date string indicating the time
                that the state was last updated.

              * **Action** *(string) --*

                Indicates which action was called on the table,
                currently only "CREATE" or "UPDATE".

              * **State** *(string) --*

                A generic status for the change in progress, such as
                QUEUED, IN_PROGRESS, SUCCESS, or FAILED.

              * **Error** *(dict) --*

                An error that will only appear when the state is
                "FAILED". This is a parent-level exception message;
                there may be different "Error" objects for each
                dialect.

                * **ErrorCode** *(string) --*

                  The code associated with this error.

                * **ErrorMessage** *(string) --*

                  A message describing the error.

              * **Details** *(dict) --*

                A "StatusDetails" object with information about the
                requested change.

                * **RequestedChange** *(dict) --*

                  A "Table" object representing the requested changes.

                * **ViewValidations** *(list) --*

                  A list of "ViewValidation" objects that contain
                  information for an analytical engine to validate a
                  view.

                  * *(dict) --*

                    A structure that contains information for an
                    analytical engine to validate a view, prior to
                    persisting the view metadata. Used in the case of
                    direct "UpdateTable" or "CreateTable" API calls.

                    * **Dialect** *(string) --*

                      The dialect of the query engine.

                    * **DialectVersion** *(string) --*

                      The version of the dialect of the query engine.
                      For example, 3.0.0.

                    * **ViewValidationText** *(string) --*

                      The "SELECT" query that defines the view, as
                      provided by the customer.

                    * **UpdateTime** *(datetime) --*

                      The time of the last update.

                    * **State** *(string) --*

                      The state of the validation.

                    * **Error** *(dict) --*

                      An error associated with the validation.

                      * **ErrorCode** *(string) --*

                        The code associated with this error.

                      * **ErrorMessage** *(string) --*

                        A message describing the error.

   **Exceptions**

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"
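
   The "Status" and "ViewDefinition" structures described above can
   be inspected with a small helper. This is a minimal sketch, not
   part of boto3: the function name "summarize_view_status" and the
   shape of its return value are assumptions made for illustration,
   and it expects a table dict shaped like the structure documented
   in this response.

```python
def summarize_view_status(table):
    """Summarize the asynchronous-change status of a table dict.

    `table` is assumed to be a dict shaped like the table structure
    documented above (optional 'Status' and 'ViewDefinition' keys).
    The helper name and return shape are hypothetical.
    """
    status = table.get('Status') or {}
    view = table.get('ViewDefinition') or {}

    summary = {
        'state': status.get('State'),
        'action': status.get('Action'),
        'error': None,
        'stale_dialects': [],
        'failed_validations': [],
    }

    # 'Error' is documented to appear only when the state is FAILED.
    if status.get('State') == 'FAILED' and 'Error' in status:
        err = status['Error']
        summary['error'] = (err.get('ErrorCode'), err.get('ErrorMessage'))

    # Representations flagged IsStale must be updated before querying.
    for rep in view.get('Representations', []):
        if rep.get('IsStale'):
            summary['stale_dialects'].append(rep.get('Dialect'))

    # Per-dialect validation errors sit under Status.Details.ViewValidations.
    details = status.get('Details') or {}
    for validation in details.get('ViewValidations', []):
        if 'Error' in validation:
            summary['failed_validations'].append(
                (validation.get('Dialect'),
                 validation['Error'].get('ErrorCode'))
            )
    return summary
```

   Keeping the parent-level error separate from the per-dialect
   validation errors mirrors the structure above, where each dialect
   may report its own "Error".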


update_user_defined_function
****************************

Glue.Client.update_user_defined_function(**kwargs)

   Updates an existing function definition in the Data Catalog.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.update_user_defined_function(
          CatalogId='string',
          DatabaseName='string',
          FunctionName='string',
          FunctionInput={
              'FunctionName': 'string',
              'ClassName': 'string',
              'OwnerName': 'string',
              'OwnerType': 'USER'|'ROLE'|'GROUP',
              'ResourceUris': [
                  {
                      'ResourceType': 'JAR'|'FILE'|'ARCHIVE',
                      'Uri': 'string'
                  },
              ]
          }
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog where
        the function to be updated is located. If none is provided,
        the Amazon Web Services account ID is used by default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the catalog database where the function to be
        updated is located.

      * **FunctionName** (*string*) --

        **[REQUIRED]**

        The name of the function.

      * **FunctionInput** (*dict*) --

        **[REQUIRED]**

        A "FunctionInput" object that redefines the function in the
        Data Catalog.

        * **FunctionName** *(string) --*

          The name of the function.

        * **ClassName** *(string) --*

          The Java class that contains the function code.

        * **OwnerName** *(string) --*

          The owner of the function.

        * **OwnerType** *(string) --*

          The owner type.

        * **ResourceUris** *(list) --*

          The resource URIs for the function.

          * *(dict) --*

            The URIs for function resources.

            * **ResourceType** *(string) --*

              The type of the resource.

            * **Uri** *(string) --*

              The URI for accessing the resource.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"
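
   As an illustration of the request shape above, the following
   sketch builds a "FunctionInput" dict and passes it to
   "update_user_defined_function". The helper names
   "build_function_input" and "update_udf" are hypothetical, and the
   choice to register every URI as a "JAR" resource is an assumption
   made to keep the example short.

```python
VALID_OWNER_TYPES = ('USER', 'ROLE', 'GROUP')


def build_function_input(function_name, class_name, jar_uris,
                         owner_name=None, owner_type=None):
    """Build a FunctionInput dict (hypothetical helper, not boto3 API)."""
    if owner_type is not None and owner_type not in VALID_OWNER_TYPES:
        raise ValueError('OwnerType must be one of %r' % (VALID_OWNER_TYPES,))

    function_input = {
        'FunctionName': function_name,
        'ClassName': class_name,
        # Assumption: every resource is a JAR; FILE and ARCHIVE also exist.
        'ResourceUris': [
            {'ResourceType': 'JAR', 'Uri': uri} for uri in jar_uris
        ],
    }
    # Only send optional keys that were actually supplied.
    if owner_name is not None:
        function_input['OwnerName'] = owner_name
    if owner_type is not None:
        function_input['OwnerType'] = owner_type
    return function_input


def update_udf(client, database_name, function_name, function_input,
               catalog_id=None):
    """Call update_user_defined_function on a Glue client."""
    kwargs = {
        'DatabaseName': database_name,
        'FunctionName': function_name,
        'FunctionInput': function_input,
    }
    # CatalogId is optional; the account ID is used when it is omitted.
    if catalog_id is not None:
        kwargs['CatalogId'] = catalog_id
    return client.update_user_defined_function(**kwargs)
```

   With a real client this might be called as
   "update_udf(boto3.client('glue'), 'my_db', 'my_udf', function_input)",
   where the database and function names are placeholders.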


get_schema_version
******************

Glue.Client.get_schema_version(**kwargs)

   Get the specified schema by its unique ID assigned when a version
   of the schema is created or registered. Schema versions in Deleted
   status will not be included in the results.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_schema_version(
          SchemaId={
              'SchemaArn': 'string',
              'SchemaName': 'string',
              'RegistryName': 'string'
          },
          SchemaVersionId='string',
          SchemaVersionNumber={
              'LatestVersion': True|False,
              'VersionNumber': 123
          }
      )

   Parameters:
      * **SchemaId** (*dict*) --

        This is a wrapper structure to contain schema identity fields.
        The structure contains:

        * SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the
          schema. Either "SchemaArn" or "SchemaName" and
          "RegistryName" has to be provided.

        * SchemaId$SchemaName: The name of the schema. Either
          "SchemaArn" or "SchemaName" and "RegistryName" has to be
          provided.

        * **SchemaArn** *(string) --*

          The Amazon Resource Name (ARN) of the schema. One of
          "SchemaArn" or "SchemaName" has to be provided.

        * **SchemaName** *(string) --*

          The name of the schema. One of "SchemaArn" or "SchemaName"
          has to be provided.

        * **RegistryName** *(string) --*

          The name of the schema registry that contains the schema.

      * **SchemaVersionId** (*string*) -- The "SchemaVersionId" of the
        schema version. This field is required for fetching by schema
        ID. Either this or the "SchemaId" wrapper has to be provided.

      * **SchemaVersionNumber** (*dict*) --

        The version number of the schema.

        * **LatestVersion** *(boolean) --*

          The latest version available for the schema.

        * **VersionNumber** *(integer) --*

          The version number of the schema.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'SchemaVersionId': 'string',
             'SchemaDefinition': 'string',
             'DataFormat': 'AVRO'|'JSON'|'PROTOBUF',
             'SchemaArn': 'string',
             'VersionNumber': 123,
             'Status': 'AVAILABLE'|'PENDING'|'FAILURE'|'DELETING',
             'CreatedTime': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **SchemaVersionId** *(string) --*

          The "SchemaVersionId" of the schema version.

        * **SchemaDefinition** *(string) --*

          The schema definition for the schema ID.

        * **DataFormat** *(string) --*

          The data format of the schema definition. Currently "AVRO",
          "JSON" and "PROTOBUF" are supported.

        * **SchemaArn** *(string) --*

          The Amazon Resource Name (ARN) of the schema.

        * **VersionNumber** *(integer) --*

          The version number of the schema.

        * **Status** *(string) --*

          The status of the schema version.

        * **CreatedTime** *(string) --*

          The date and time the schema version was created.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"
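
   The parameter rules above (either "SchemaVersionId", or a
   "SchemaId" wrapper identified by ARN or by name plus registry) can
   be encoded in a small request builder. This is a sketch under
   stated assumptions: the helper name "schema_version_request" is
   hypothetical, and defaulting to "LatestVersion" when no version
   number is given is a choice made for this example rather than
   documented service behavior.

```python
def schema_version_request(schema_version_id=None, schema_arn=None,
                           schema_name=None, registry_name=None,
                           version_number=None):
    """Build keyword arguments for get_schema_version (hypothetical helper)."""
    # Fetching by the version's unique ID needs nothing else.
    if schema_version_id is not None:
        return {'SchemaVersionId': schema_version_id}

    # Otherwise identify the schema by ARN, or by name plus registry.
    if schema_arn is not None:
        schema_id = {'SchemaArn': schema_arn}
    elif schema_name is not None and registry_name is not None:
        schema_id = {'SchemaName': schema_name,
                     'RegistryName': registry_name}
    else:
        raise ValueError(
            'Provide SchemaVersionId, SchemaArn, or '
            'SchemaName together with RegistryName.')

    # Assumption: ask for the latest version unless a number is given.
    if version_number is None:
        number = {'LatestVersion': True}
    else:
        number = {'VersionNumber': version_number}
    return {'SchemaId': schema_id, 'SchemaVersionNumber': number}
```

   The resulting dict can be unpacked into the call, for example
   "client.get_schema_version(**schema_version_request(schema_arn=arn))".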


list_data_quality_rulesets
**************************

Glue.Client.list_data_quality_rulesets(**kwargs)

   Returns a paginated list of rulesets for the specified list of Glue
   tables.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.list_data_quality_rulesets(
          NextToken='string',
          MaxResults=123,
          Filter={
              'Name': 'string',
              'Description': 'string',
              'CreatedBefore': datetime(2015, 1, 1),
              'CreatedAfter': datetime(2015, 1, 1),
              'LastModifiedBefore': datetime(2015, 1, 1),
              'LastModifiedAfter': datetime(2015, 1, 1),
              'TargetTable': {
                  'TableName': 'string',
                  'DatabaseName': 'string',
                  'CatalogId': 'string'
              }
          },
          Tags={
              'string': 'string'
          }
      )

   Parameters:
      * **NextToken** (*string*) -- A paginated token to offset the
        results.

      * **MaxResults** (*integer*) -- The maximum number of results to
        return.

      * **Filter** (*dict*) --

        The filter criteria.

        * **Name** *(string) --*

          The name of the ruleset filter criteria.

        * **Description** *(string) --*

          The description of the ruleset filter criteria.

        * **CreatedBefore** *(datetime) --*

          Filter on rulesets created before this date.

        * **CreatedAfter** *(datetime) --*

          Filter on rulesets created after this date.

        * **LastModifiedBefore** *(datetime) --*

          Filter on rulesets last modified before this date.

        * **LastModifiedAfter** *(datetime) --*

          Filter on rulesets last modified after this date.

        * **TargetTable** *(dict) --*

          The name and database name of the target table.

          * **TableName** *(string) --* **[REQUIRED]**

            The name of the Glue table.

          * **DatabaseName** *(string) --* **[REQUIRED]**

            The name of the database where the Glue table exists.

          * **CatalogId** *(string) --*

            The catalog id where the Glue table exists.

      * **Tags** (*dict*) --

        A list of key-value pair tags.

        * *(string) --*

          * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Rulesets': [
                 {
                     'Name': 'string',
                     'Description': 'string',
                     'CreatedOn': datetime(2015, 1, 1),
                     'LastModifiedOn': datetime(2015, 1, 1),
                     'TargetTable': {
                         'TableName': 'string',
                         'DatabaseName': 'string',
                         'CatalogId': 'string'
                     },
                     'RecommendationRunId': 'string',
                     'RuleCount': 123
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Rulesets** *(list) --*

          A paginated list of rulesets for the specified list of Glue
          tables.

          * *(dict) --*

            Describes a data quality ruleset returned by
            "GetDataQualityRuleset".

            * **Name** *(string) --*

              The name of the data quality ruleset.

            * **Description** *(string) --*

              A description of the data quality ruleset.

            * **CreatedOn** *(datetime) --*

              The date and time the data quality ruleset was created.

            * **LastModifiedOn** *(datetime) --*

              The date and time the data quality ruleset was last
              modified.

            * **TargetTable** *(dict) --*

              An object representing a Glue table.

              * **TableName** *(string) --*

                The name of the Glue table.

              * **DatabaseName** *(string) --*

                The name of the database where the Glue table exists.

              * **CatalogId** *(string) --*

                The catalog id where the Glue table exists.

            * **RecommendationRunId** *(string) --*

              When a ruleset was created from a recommendation run,
              this run ID is generated to link the two together.

            * **RuleCount** *(integer) --*

              The number of rules in the ruleset.

        * **NextToken** *(string) --*

          A pagination token, if more results are available.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"
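
   Because the response carries a "NextToken" when more results are
   available, callers typically loop until the token is absent. The
   generator below is a minimal sketch of that loop; the name
   "iter_rulesets" and its arguments are hypothetical, and it filters
   only on "TargetTable" to keep the example short.

```python
def iter_rulesets(client, table_name=None, database_name=None,
                  page_size=50):
    """Yield every ruleset, following NextToken (hypothetical helper)."""
    # TargetTable requires both TableName and DatabaseName.
    if (table_name is None) != (database_name is None):
        raise ValueError('Pass table_name and database_name together.')

    kwargs = {'MaxResults': page_size}
    if table_name is not None:
        kwargs['Filter'] = {
            'TargetTable': {
                'TableName': table_name,
                'DatabaseName': database_name,
            }
        }

    while True:
        response = client.list_data_quality_rulesets(**kwargs)
        for ruleset in response.get('Rulesets', []):
            yield ruleset
        token = response.get('NextToken')
        if not token:
            break
        # Pass the token back to fetch the next page.
        kwargs['NextToken'] = token
```

   A generator keeps memory use flat for large catalogs, since each
   page is discarded once its rulesets have been yielded.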


delete_integration_table_properties
***********************************

Glue.Client.delete_integration_table_properties(**kwargs)

   Deletes the table properties that have been created for the tables
   that need to be replicated.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_integration_table_properties(
          ResourceArn='string',
          TableName='string'
      )

   Parameters:
      * **ResourceArn** (*string*) --

        **[REQUIRED]**

        The connection ARN of the source, or the database ARN of the
        target.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table to be replicated.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.ValidationException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.ResourceNotFoundException"

   * "Glue.Client.exceptions.InternalServerException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.InvalidInputException"


get_dev_endpoint
****************

Glue.Client.get_dev_endpoint(**kwargs)

   Retrieves information about a specified development endpoint.

   Note:

     When you create a development endpoint in a virtual private cloud
     (VPC), Glue returns only a private IP address, and the public IP
     address field is not populated. When you create a non-VPC
     development endpoint, Glue returns only a public IP address.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_dev_endpoint(
          EndpointName='string'
      )

   Parameters:
      **EndpointName** (*string*) --

      **[REQUIRED]**

      Name of the "DevEndpoint" to retrieve information for.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'DevEndpoint': {
                 'EndpointName': 'string',
                 'RoleArn': 'string',
                 'SecurityGroupIds': [
                     'string',
                 ],
                 'SubnetId': 'string',
                 'YarnEndpointAddress': 'string',
                 'PrivateAddress': 'string',
                 'ZeppelinRemoteSparkInterpreterPort': 123,
                 'PublicAddress': 'string',
                 'Status': 'string',
                 'WorkerType': 'Standard'|'G.1X'|'G.2X'|'G.025X'|'G.4X'|'G.8X'|'Z.2X',
                 'GlueVersion': 'string',
                 'NumberOfWorkers': 123,
                 'NumberOfNodes': 123,
                 'AvailabilityZone': 'string',
                 'VpcId': 'string',
                 'ExtraPythonLibsS3Path': 'string',
                 'ExtraJarsS3Path': 'string',
                 'FailureReason': 'string',
                 'LastUpdateStatus': 'string',
                 'CreatedTimestamp': datetime(2015, 1, 1),
                 'LastModifiedTimestamp': datetime(2015, 1, 1),
                 'PublicKey': 'string',
                 'PublicKeys': [
                     'string',
                 ],
                 'SecurityConfiguration': 'string',
                 'Arguments': {
                     'string': 'string'
                 }
             }
         }

      **Response Structure**

      * *(dict) --*

        * **DevEndpoint** *(dict) --*

          A "DevEndpoint" definition.

          * **EndpointName** *(string) --*

            The name of the "DevEndpoint".

          * **RoleArn** *(string) --*

            The Amazon Resource Name (ARN) of the IAM role used in
            this "DevEndpoint".

          * **SecurityGroupIds** *(list) --*

            A list of security group identifiers used in this
            "DevEndpoint".

            * *(string) --*

          * **SubnetId** *(string) --*

            The subnet ID for this "DevEndpoint".

          * **YarnEndpointAddress** *(string) --*

            The YARN endpoint address used by this "DevEndpoint".

          * **PrivateAddress** *(string) --*

            A private IP address to access the "DevEndpoint" within a
            VPC if the "DevEndpoint" is created within one. The
            "PrivateAddress" field is present only when you create the
            "DevEndpoint" within your VPC.

          * **ZeppelinRemoteSparkInterpreterPort** *(integer) --*

            The Apache Zeppelin port for the remote Apache Spark
            interpreter.

          * **PublicAddress** *(string) --*

            The public IP address used by this "DevEndpoint". The
            "PublicAddress" field is present only when you create a
            non-virtual private cloud (VPC) "DevEndpoint".

          * **Status** *(string) --*

            The current status of this "DevEndpoint".

          * **WorkerType** *(string) --*

            The type of predefined worker that is allocated to the
            development endpoint. Accepts a value of Standard, G.1X,
            or G.2X.

            * For the "Standard" worker type, each worker provides 4
              vCPU, 16 GB of memory and a 50 GB disk, and 2 executors
              per worker.

            * For the "G.1X" worker type, each worker maps to 1 DPU (4
              vCPU, 16 GB of memory, 64 GB disk), and provides 1
              executor per worker. We recommend this worker type for
              memory-intensive jobs.

            * For the "G.2X" worker type, each worker maps to 2 DPU (8
              vCPU, 32 GB of memory, 128 GB disk), and provides 1
              executor per worker. We recommend this worker type for
              memory-intensive jobs.

            Known issue: when a development endpoint is created with
            the "G.2X" "WorkerType" configuration, the Spark drivers
            for the development endpoint will run on 4 vCPU, 16 GB of
            memory, and a 64 GB disk.

          * **GlueVersion** *(string) --*

            Glue version determines the versions of Apache Spark and
            Python that Glue supports. The Python version indicates
            the version supported for running your ETL scripts on
            development endpoints.

            For more information about the available Glue versions and
            corresponding Spark and Python versions, see Glue version
            in the developer guide.

            Development endpoints that are created without specifying
            a Glue version default to Glue 0.9.

            You can specify a version of Python support for
            development endpoints by using the "Arguments" parameter
            in the "CreateDevEndpoint" or "UpdateDevEndpoint" APIs. If
            no arguments are provided, the version defaults to Python
            2.

          * **NumberOfWorkers** *(integer) --*

            The number of workers of a defined "workerType" that are
            allocated to the development endpoint.

            The maximum number of workers you can define is 299 for
            "G.1X", and 149 for "G.2X".

          * **NumberOfNodes** *(integer) --*

            The number of Glue Data Processing Units (DPUs) allocated
            to this "DevEndpoint".

          * **AvailabilityZone** *(string) --*

            The Amazon Web Services Availability Zone where this
            "DevEndpoint" is located.

          * **VpcId** *(string) --*

            The ID of the virtual private cloud (VPC) used by this
            "DevEndpoint".

          * **ExtraPythonLibsS3Path** *(string) --*

            The paths to one or more Python libraries in an Amazon S3
            bucket that should be loaded in your "DevEndpoint".
            Multiple values must be complete paths separated by a
            comma.

            Note:

              You can only use pure Python libraries with a
              "DevEndpoint". Libraries that rely on C extensions, such
              as the pandas Python data analysis library, are not
              currently supported.

          * **ExtraJarsS3Path** *(string) --*

            The path to one or more Java ".jar" files in an S3 bucket
            that should be loaded in your "DevEndpoint".

            Note:

              You can only use pure Java/Scala libraries with a
              "DevEndpoint".

          * **FailureReason** *(string) --*

            The reason for a current failure in this "DevEndpoint".

          * **LastUpdateStatus** *(string) --*

            The status of the last update.

          * **CreatedTimestamp** *(datetime) --*

            The point in time at which this "DevEndpoint" was created.

          * **LastModifiedTimestamp** *(datetime) --*

            The point in time at which this "DevEndpoint" was last
            modified.

          * **PublicKey** *(string) --*

            The public key to be used by this "DevEndpoint" for
            authentication. This attribute is provided for backward
            compatibility because the recommended attribute to use is
            public keys.

          * **PublicKeys** *(list) --*

            A list of public keys to be used by the "DevEndpoints" for
            authentication. Using this attribute is preferred over a
            single public key because the public keys allow you to
            have a different private key per client.

            Note:

              If you previously created an endpoint with a public key,
              you must remove that key to be able to set a list of
              public keys. Call the "UpdateDevEndpoint" API operation
              with the public key content in the "deletePublicKeys"
              attribute, and the list of new keys in the
              "addPublicKeys" attribute.

            * *(string) --*

          * **SecurityConfiguration** *(string) --*

            The name of the "SecurityConfiguration" structure to be
            used with this "DevEndpoint".

          * **Arguments** *(dict) --*

            A map of arguments used to configure the "DevEndpoint".

            Valid arguments are:

            * "--enable-glue-datacatalog": ""

            You can specify a version of Python support for
            development endpoints by using the "Arguments" parameter
            in the "CreateDevEndpoint" or "UpdateDevEndpoint" APIs. If
            no arguments are provided, the version defaults to Python
            2.

            * *(string) --*

              * *(string) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"
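
   As the note above explains, a VPC endpoint reports only
   "PrivateAddress" and a non-VPC endpoint only "PublicAddress". The
   sketch below picks whichever is present; the helper name
   "endpoint_address" and its tuple return value are assumptions made
   for illustration.

```python
def endpoint_address(client, endpoint_name):
    """Return (kind, address) for a development endpoint.

    Hypothetical helper: `kind` is 'private', 'public', or None when
    neither address field is populated.
    """
    response = client.get_dev_endpoint(EndpointName=endpoint_name)
    endpoint = response['DevEndpoint']

    # VPC endpoints populate PrivateAddress only.
    if endpoint.get('PrivateAddress'):
        return ('private', endpoint['PrivateAddress'])
    # Non-VPC endpoints populate PublicAddress only.
    if endpoint.get('PublicAddress'):
        return ('public', endpoint['PublicAddress'])
    return (None, None)
```

   With a real client this might be called as
   "endpoint_address(boto3.client('glue'), 'my-endpoint')", where the
   endpoint name is a placeholder.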


get_tables
**********

Glue.Client.get_tables(**kwargs)

   Retrieves the definitions of some or all of the tables in a given
   "Database".

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_tables(
          CatalogId='string',
          DatabaseName='string',
          Expression='string',
          NextToken='string',
          MaxResults=123,
          TransactionId='string',
          QueryAsOfTime=datetime(2015, 1, 1),
          IncludeStatusDetails=True|False,
          AttributesToGet=[
              'NAME'|'TABLE_TYPE',
          ]
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog where
        the tables reside. If none is provided, the Amazon Web
        Services account ID is used by default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The database in the catalog whose tables to list. For Hive
        compatibility, this name is entirely lowercase.

      * **Expression** (*string*) -- A regular expression pattern. If
        present, only those tables whose names match the pattern are
        returned.

      * **NextToken** (*string*) -- A continuation token, included if
        this is a continuation call.

      * **MaxResults** (*integer*) -- The maximum number of tables to
        return in a single response.

      * **TransactionId** (*string*) -- The transaction ID at which to
        read the table contents.

      * **QueryAsOfTime** (*datetime*) -- The point in time as of
        which to read the table contents. If not set, the most recent
        transaction commit time will be used. Cannot be specified
        along with "TransactionId".

      * **IncludeStatusDetails** (*boolean*) -- Specifies whether to
        include status details related to a request to create or
        update a Glue Data Catalog view.

      * **AttributesToGet** (*list*) --

        Specifies the table fields returned by the "GetTables" call.
        This parameter doesn’t accept an empty list. The request must
        include "NAME".

        The following are the valid combinations of values:

        * "NAME" - Names of all tables in the database.

        * "NAME", "TABLE_TYPE" - Names of all tables and the table
          types.

        * *(string) --*
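
   The parameters above combine naturally into a loop that lists
   table names page by page. This is a minimal sketch: the helper
   name "iter_table_names" is hypothetical, and it relies on the
   response containing "TableList" and, when more results remain, a
   "NextToken" to pass back as the continuation token.

```python
def iter_table_names(client, database_name, expression=None,
                     catalog_id=None):
    """Yield (name, table_type) pairs for a database (hypothetical helper)."""
    kwargs = {
        'DatabaseName': database_name,
        # AttributesToGet must include NAME and cannot be empty.
        'AttributesToGet': ['NAME', 'TABLE_TYPE'],
    }
    # Optional parameters are sent only when supplied.
    if expression is not None:
        kwargs['Expression'] = expression
    if catalog_id is not None:
        kwargs['CatalogId'] = catalog_id

    while True:
        response = client.get_tables(**kwargs)
        for table in response.get('TableList', []):
            yield table['Name'], table.get('TableType')
        token = response.get('NextToken')
        if not token:
            break
        # Continuation call: pass the token from the previous page.
        kwargs['NextToken'] = token
```

   Requesting only "NAME" and "TABLE_TYPE" keeps each page small when
   the full table definitions are not needed.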

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'TableList': [
                 {
                     'Name': 'string',
                     'DatabaseName': 'string',
                     'Description': 'string',
                     'Owner': 'string',
                     'CreateTime': datetime(2015, 1, 1),
                     'UpdateTime': datetime(2015, 1, 1),
                     'LastAccessTime': datetime(2015, 1, 1),
                     'LastAnalyzedTime': datetime(2015, 1, 1),
                     'Retention': 123,
                     'StorageDescriptor': {
                         'Columns': [
                             {
                                 'Name': 'string',
                                 'Type': 'string',
                                 'Comment': 'string',
                                 'Parameters': {
                                     'string': 'string'
                                 }
                             },
                         ],
                         'Location': 'string',
                         'AdditionalLocations': [
                             'string',
                         ],
                         'InputFormat': 'string',
                         'OutputFormat': 'string',
                         'Compressed': True|False,
                         'NumberOfBuckets': 123,
                         'SerdeInfo': {
                             'Name': 'string',
                             'SerializationLibrary': 'string',
                             'Parameters': {
                                 'string': 'string'
                             }
                         },
                         'BucketColumns': [
                             'string',
                         ],
                         'SortColumns': [
                             {
                                 'Column': 'string',
                                 'SortOrder': 123
                             },
                         ],
                         'Parameters': {
                             'string': 'string'
                         },
                         'SkewedInfo': {
                             'SkewedColumnNames': [
                                 'string',
                             ],
                             'SkewedColumnValues': [
                                 'string',
                             ],
                             'SkewedColumnValueLocationMaps': {
                                 'string': 'string'
                             }
                         },
                         'StoredAsSubDirectories': True|False,
                         'SchemaReference': {
                             'SchemaId': {
                                 'SchemaArn': 'string',
                                 'SchemaName': 'string',
                                 'RegistryName': 'string'
                             },
                             'SchemaVersionId': 'string',
                             'SchemaVersionNumber': 123
                         }
                     },
                     'PartitionKeys': [
                         {
                             'Name': 'string',
                             'Type': 'string',
                             'Comment': 'string',
                             'Parameters': {
                                 'string': 'string'
                             }
                         },
                     ],
                     'ViewOriginalText': 'string',
                     'ViewExpandedText': 'string',
                     'TableType': 'string',
                     'Parameters': {
                         'string': 'string'
                     },
                     'CreatedBy': 'string',
                     'IsRegisteredWithLakeFormation': True|False,
                     'TargetTable': {
                         'CatalogId': 'string',
                         'DatabaseName': 'string',
                         'Name': 'string',
                         'Region': 'string'
                     },
                     'CatalogId': 'string',
                     'VersionId': 'string',
                     'FederatedTable': {
                         'Identifier': 'string',
                         'DatabaseIdentifier': 'string',
                         'ConnectionName': 'string',
                         'ConnectionType': 'string'
                     },
                     'ViewDefinition': {
                         'IsProtected': True|False,
                         'Definer': 'string',
                         'SubObjects': [
                             'string',
                         ],
                         'Representations': [
                             {
                                 'Dialect': 'REDSHIFT'|'ATHENA'|'SPARK',
                                 'DialectVersion': 'string',
                                 'ViewOriginalText': 'string',
                                 'ViewExpandedText': 'string',
                                 'ValidationConnection': 'string',
                                 'IsStale': True|False
                             },
                         ]
                     },
                     'IsMultiDialectView': True|False,
                     'Status': {
                         'RequestedBy': 'string',
                         'UpdatedBy': 'string',
                         'RequestTime': datetime(2015, 1, 1),
                         'UpdateTime': datetime(2015, 1, 1),
                         'Action': 'UPDATE'|'CREATE',
                         'State': 'QUEUED'|'IN_PROGRESS'|'SUCCESS'|'STOPPED'|'FAILED',
                         'Error': {
                             'ErrorCode': 'string',
                             'ErrorMessage': 'string'
                         },
                         'Details': {
                             'RequestedChange': {'... recursive ...'},
                             'ViewValidations': [
                                 {
                                     'Dialect': 'REDSHIFT'|'ATHENA'|'SPARK',
                                     'DialectVersion': 'string',
                                     'ViewValidationText': 'string',
                                     'UpdateTime': datetime(2015, 1, 1),
                                     'State': 'QUEUED'|'IN_PROGRESS'|'SUCCESS'|'STOPPED'|'FAILED',
                                     'Error': {
                                         'ErrorCode': 'string',
                                         'ErrorMessage': 'string'
                                     }
                                 },
                             ]
                         }
                     }
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **TableList** *(list) --*

          A list of the requested "Table" objects.

          * *(dict) --*

            Represents a collection of related data organized in
            columns and rows.

            * **Name** *(string) --*

              The table name. For Hive compatibility, this must be
              entirely lowercase.

            * **DatabaseName** *(string) --*

              The name of the database where the table metadata
              resides. For Hive compatibility, this must be all
              lowercase.

            * **Description** *(string) --*

              A description of the table.

            * **Owner** *(string) --*

              The owner of the table.

            * **CreateTime** *(datetime) --*

              The time when the table definition was created in the
              Data Catalog.

            * **UpdateTime** *(datetime) --*

              The last time that the table was updated.

            * **LastAccessTime** *(datetime) --*

              The last time that the table was accessed. This is
              usually taken from HDFS, and might not be reliable.

            * **LastAnalyzedTime** *(datetime) --*

              The last time that column statistics were computed for
              this table.

            * **Retention** *(integer) --*

              The retention time for this table.

            * **StorageDescriptor** *(dict) --*

              A storage descriptor containing information about the
              physical storage of this table.

              * **Columns** *(list) --*

                A list of the "Columns" in the table.

                * *(dict) --*

                  A column in a "Table".

                  * **Name** *(string) --*

                    The name of the "Column".

                  * **Type** *(string) --*

                    The data type of the "Column".

                  * **Comment** *(string) --*

                    A free-form text comment.

                  * **Parameters** *(dict) --*

                    These key-value pairs define properties associated
                    with the column.

                    * *(string) --*

                      * *(string) --*

              * **Location** *(string) --*

                The physical location of the table. By default, this
                takes the form of the warehouse location, followed by
                the database location in the warehouse, followed by
                the table name.

              * **AdditionalLocations** *(list) --*

                A list of locations that point to the path where a
                Delta table is located.

                * *(string) --*

              * **InputFormat** *(string) --*

                The input format: "SequenceFileInputFormat" (binary),
                or "TextInputFormat", or a custom format.

              * **OutputFormat** *(string) --*

                The output format: "SequenceFileOutputFormat"
                (binary), or "IgnoreKeyTextOutputFormat", or a custom
                format.

              * **Compressed** *(boolean) --*

                "True" if the data in the table is compressed, or
                "False" if not.

              * **NumberOfBuckets** *(integer) --*

                Must be specified if the table contains any dimension
                columns.

              * **SerdeInfo** *(dict) --*

                The serialization/deserialization (SerDe) information.

                * **Name** *(string) --*

                  Name of the SerDe.

                * **SerializationLibrary** *(string) --*

                  Usually the class that implements the SerDe. An
                  example is
                  "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe".

                * **Parameters** *(dict) --*

                  These key-value pairs define initialization
                  parameters for the SerDe.

                  * *(string) --*

                    * *(string) --*

              * **BucketColumns** *(list) --*

                A list of reducer grouping columns, clustering
                columns, and bucketing columns in the table.

                * *(string) --*

              * **SortColumns** *(list) --*

                A list specifying the sort order of each bucket in the
                table.

                * *(dict) --*

                  Specifies the sort order of a sorted column.

                  * **Column** *(string) --*

                    The name of the column.

                  * **SortOrder** *(integer) --*

                    Indicates that the column is sorted in ascending
                    order ("== 1"), or in descending order ("== 0").

              * **Parameters** *(dict) --*

                The user-supplied properties in key-value form.

                * *(string) --*

                  * *(string) --*

              * **SkewedInfo** *(dict) --*

                The information about values that appear frequently in
                a column (skewed values).

                * **SkewedColumnNames** *(list) --*

                  A list of names of columns that contain skewed
                  values.

                  * *(string) --*

                * **SkewedColumnValues** *(list) --*

                  A list of values that appear so frequently as to be
                  considered skewed.

                  * *(string) --*

                * **SkewedColumnValueLocationMaps** *(dict) --*

                  A mapping of skewed values to the columns that
                  contain them.

                  * *(string) --*

                    * *(string) --*

              * **StoredAsSubDirectories** *(boolean) --*

                "True" if the table data is stored in subdirectories,
                or "False" if not.

              * **SchemaReference** *(dict) --*

                An object that references a schema stored in the Glue
                Schema Registry.

                When creating a table, you can pass an empty list of
                columns for the schema, and instead use a schema
                reference.

                * **SchemaId** *(dict) --*

                  A structure that contains schema identity fields.
                  Either this or the "SchemaVersionId" has to be
                  provided.

                  * **SchemaArn** *(string) --*

                    The Amazon Resource Name (ARN) of the schema. One
                    of "SchemaArn" or "SchemaName" has to be provided.

                  * **SchemaName** *(string) --*

                    The name of the schema. One of "SchemaArn" or
                    "SchemaName" has to be provided.

                  * **RegistryName** *(string) --*

                    The name of the schema registry that contains the
                    schema.

                * **SchemaVersionId** *(string) --*

                  The unique ID assigned to a version of the schema.
                  Either this or the "SchemaId" has to be provided.

                * **SchemaVersionNumber** *(integer) --*

                  The version number of the schema.

            * **PartitionKeys** *(list) --*

              A list of columns by which the table is partitioned.
              Only primitive types are supported as partition keys.

              When you create a table used by Amazon Athena, and you
              do not specify any "PartitionKeys", you must at least
              set the value of "PartitionKeys" to an empty list. For
              example:

                 "PartitionKeys": []

              * *(dict) --*

                A column in a "Table".

                * **Name** *(string) --*

                  The name of the "Column".

                * **Type** *(string) --*

                  The data type of the "Column".

                * **Comment** *(string) --*

                  A free-form text comment.

                * **Parameters** *(dict) --*

                  These key-value pairs define properties associated
                  with the column.

                  * *(string) --*

                    * *(string) --*

            * **ViewOriginalText** *(string) --*

              Included for Apache Hive compatibility. Not used in the
              normal course of Glue operations. If the table is a
              "VIRTUAL_VIEW", this field holds certain Athena
              configuration encoded in base64.

            * **ViewExpandedText** *(string) --*

              Included for Apache Hive compatibility. Not used in the
              normal course of Glue operations.

            * **TableType** *(string) --*

              The type of this table. Glue will create tables with the
              "EXTERNAL_TABLE" type. Other services, such as Athena,
              may create tables with additional table types.

              Glue related table types:

                 EXTERNAL_TABLE

              Hive compatible attribute - indicates a non-Hive managed
              table.

                 GOVERNED

              Used by Lake Formation. The Glue Data Catalog
              understands "GOVERNED".

            * **Parameters** *(dict) --*

              These key-value pairs define properties associated with
              the table.

              * *(string) --*

                * *(string) --*

            * **CreatedBy** *(string) --*

              The person or entity who created the table.

            * **IsRegisteredWithLakeFormation** *(boolean) --*

              Indicates whether the table has been registered with
              Lake Formation.

            * **TargetTable** *(dict) --*

              A "TableIdentifier" structure that describes a target
              table for resource linking.

              * **CatalogId** *(string) --*

                The ID of the Data Catalog in which the table resides.

              * **DatabaseName** *(string) --*

                The name of the catalog database that contains the
                target table.

              * **Name** *(string) --*

                The name of the target table.

              * **Region** *(string) --*

                Region of the target table.

            * **CatalogId** *(string) --*

              The ID of the Data Catalog in which the table resides.

            * **VersionId** *(string) --*

              The ID of the table version.

            * **FederatedTable** *(dict) --*

              A "FederatedTable" structure that references an entity
              outside the Glue Data Catalog.

              * **Identifier** *(string) --*

                A unique identifier for the federated table.

              * **DatabaseIdentifier** *(string) --*

                A unique identifier for the federated database.

              * **ConnectionName** *(string) --*

                The name of the connection to the external metastore.

              * **ConnectionType** *(string) --*

                The type of connection used to access the federated
                table, specifying the protocol or method for
                connecting to the external data source.

            * **ViewDefinition** *(dict) --*

              A structure that contains all the information that
              defines the view, including the dialect or dialects for
              the view, and the query.

              * **IsProtected** *(boolean) --*

                You can set this flag as true to instruct the engine
                not to push user-provided operations into the logical
                plan of the view during query planning. However,
                setting this flag does not guarantee that the engine
                will comply. Refer to the engine's documentation to
                understand the guarantees provided, if any.

              * **Definer** *(string) --*

                The definer of a view in SQL.

              * **SubObjects** *(list) --*

                A list of table Amazon Resource Names (ARNs).

                * *(string) --*

              * **Representations** *(list) --*

                A list of representations.

                * *(dict) --*

                  A structure that contains the dialect of the view,
                  and the query that defines the view.

                  * **Dialect** *(string) --*

                    The dialect of the query engine.

                  * **DialectVersion** *(string) --*

                    The version of the dialect of the query engine.
                    For example, 3.0.0.

                  * **ViewOriginalText** *(string) --*

                    The "SELECT" query provided by the customer during
                    "CREATE VIEW DDL". This SQL is not used during a
                    query on a view ("ViewExpandedText" is used
                    instead). "ViewOriginalText" is used for cases
                    like "SHOW CREATE VIEW" where users want to see
                    the original DDL command that created the view.

                  * **ViewExpandedText** *(string) --*

                    The expanded SQL for the view. This SQL is used by
                    engines while processing a query on a view.
                    Engines may perform operations during view
                    creation to transform "ViewOriginalText" to
                    "ViewExpandedText". For example:

                    * Fully qualified identifiers: "SELECT * from
                      table1 -> SELECT * from db1.table1"

                  * **ValidationConnection** *(string) --*

                    The name of the connection to be used to validate
                    the specific representation of the view.

                  * **IsStale** *(boolean) --*

                    Dialects marked as stale are no longer valid and
                    must be updated before they can be queried in
                    their respective query engines.

            * **IsMultiDialectView** *(boolean) --*

              Specifies whether the view supports the SQL dialects of
              one or more different query engines and can therefore be
              read by those engines.

            * **Status** *(dict) --*

              A structure containing information about the state of an
              asynchronous change to a table.

              * **RequestedBy** *(string) --*

                The ARN of the user who requested the asynchronous
                change.

              * **UpdatedBy** *(string) --*

                The ARN of the user who last manually altered the
                asynchronous change (for example, by requesting
                cancellation).

              * **RequestTime** *(datetime) --*

                An ISO 8601 formatted date string indicating the time
                that the change was initiated.

              * **UpdateTime** *(datetime) --*

                An ISO 8601 formatted date string indicating the time
                that the state was last updated.

              * **Action** *(string) --*

                Indicates which action was called on the table,
                currently only "CREATE" or "UPDATE".

              * **State** *(string) --*

                A generic status for the change in progress, such as
                QUEUED, IN_PROGRESS, SUCCESS, or FAILED.

              * **Error** *(dict) --*

                An error that appears only when the state is
                "FAILED". This is a parent-level exception message;
                there may be a different "Error" for each dialect.

                * **ErrorCode** *(string) --*

                  The code associated with this error.

                * **ErrorMessage** *(string) --*

                  A message describing the error.

              * **Details** *(dict) --*

                A "StatusDetails" object with information about the
                requested change.

                * **RequestedChange** *(dict) --*

                  A "Table" object representing the requested changes.

                * **ViewValidations** *(list) --*

                  A list of "ViewValidation" objects that contain
                  information for an analytical engine to validate a
                  view.

                  * *(dict) --*

                    A structure that contains information for an
                    analytical engine to validate a view, prior to
                    persisting the view metadata. Used in the case of
                    direct "UpdateTable" or "CreateTable" API calls.

                    * **Dialect** *(string) --*

                      The dialect of the query engine.

                    * **DialectVersion** *(string) --*

                      The version of the dialect of the query engine.
                      For example, 3.0.0.

                    * **ViewValidationText** *(string) --*

                      The "SELECT" query that defines the view, as
                      provided by the customer.

                    * **UpdateTime** *(datetime) --*

                      The time of the last update.

                    * **State** *(string) --*

                      The state of the validation.

                    * **Error** *(dict) --*

                      An error associated with the validation.

                      * **ErrorCode** *(string) --*

                        The code associated with this error.

                      * **ErrorMessage** *(string) --*

                        A message describing the error.

        * **NextToken** *(string) --*

          A continuation token, present if the current list segment is
          not the last.
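
      The "NextToken" field suggests a simple loop: request a page,
      yield its "TableList", and repeat while a token comes back. The
      following is a minimal sketch, not part of the boto3 API.
      "iter_tables" and "partition_key_names" are hypothetical helper
      names, and "fetch_page" stands for whichever client method
      returns this structure, assumed here to accept "NextToken" as a
      keyword argument.

```python
from typing import Any, Callable, Dict, Iterator, List


def iter_tables(
    fetch_page: Callable[..., Dict[str, Any]], **kwargs: Any
) -> Iterator[Dict[str, Any]]:
    """Yield every Table dict across all pages of a TableList response.

    fetch_page is an assumption: any callable that returns a dict with
    'TableList' and, optionally, 'NextToken'.
    """
    token = None
    while True:
        page_kwargs = dict(kwargs)
        if token:
            # Only send NextToken once the service has handed one back.
            page_kwargs["NextToken"] = token
        response = fetch_page(**page_kwargs)
        yield from response.get("TableList", [])
        token = response.get("NextToken")
        if not token:
            # No token means the current segment was the last one.
            break


def partition_key_names(table: Dict[str, Any]) -> List[str]:
    """Return the partition column names of a Table dict (may be empty)."""
    return [key["Name"] for key in table.get("PartitionKeys", [])]
```

      For operations where "client.can_paginate(...)" returns "True",
      the paginator from "client.get_paginator(...)" is likely the
      more convenient route; the loop above only illustrates the token
      protocol.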

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.GlueEncryptionException"

   * "Glue.Client.exceptions.FederationSourceException"

   * "Glue.Client.exceptions.FederationSourceRetryableException"
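
   Some of the exceptions listed above look transient by name (for
   example "FederationSourceRetryableException"). The following is a
   rough sketch of retrying only on a chosen set of exception classes.
   "call_with_retry" is a hypothetical helper, and the choice of which
   exceptions to treat as retryable is an assumption left to the
   caller; the source does not state a retry policy.

```python
import time
from typing import Any, Callable, Tuple, Type


def call_with_retry(
    operation: Callable[[], Any],
    retryable: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    delay_seconds: float = 1.0,
) -> Any:
    """Call operation(), retrying with exponential backoff on retryable errors.

    Exceptions not listed in `retryable` propagate immediately.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except retryable:
            if attempt == attempts - 1:
                # Out of attempts: surface the last error to the caller.
                raise
            # Back off: delay, 2 * delay, 4 * delay, ...
            time.sleep(delay_seconds * (2 ** attempt))
```

   With a real client, "retryable" could be a tuple such as
   "(client.exceptions.FederationSourceRetryableException,
   client.exceptions.OperationTimeoutException)", and "operation" a
   lambda wrapping the client call.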


batch_get_table_optimizer
*************************

Glue.Client.batch_get_table_optimizer(**kwargs)

   Returns the configuration for the specified table optimizers.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.batch_get_table_optimizer(
          Entries=[
              {
                  'catalogId': 'string',
                  'databaseName': 'string',
                  'tableName': 'string',
                  'type': 'compaction'|'retention'|'orphan_file_deletion'
              },
          ]
      )

   Parameters:
      **Entries** (*list*) --

      **[REQUIRED]**

      A list of "BatchGetTableOptimizerEntry" objects specifying the
      table optimizers to retrieve.

      * *(dict) --*

        Represents a table optimizer to retrieve in the
        "BatchGetTableOptimizer" operation.

        * **catalogId** *(string) --*

          The Catalog ID of the table.

        * **databaseName** *(string) --*

          The name of the database in the catalog in which the table
          resides.

        * **tableName** *(string) --*

          The name of the table.

        * **type** *(string) --*

          The type of table optimizer.
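
   Each element of "Entries" uses the same four keys, so the list can
   be generated rather than written by hand. A small sketch follows.
   "build_optimizer_entries" is a hypothetical helper, and the
   assumption that all tables share one catalog, database, and
   optimizer type is made only to keep the example short.

```python
from typing import Dict, Iterable, List

# The three values shown for 'type' in the request syntax above.
VALID_OPTIMIZER_TYPES = ("compaction", "retention", "orphan_file_deletion")


def build_optimizer_entries(
    catalog_id: str,
    database_name: str,
    table_names: Iterable[str],
    optimizer_type: str = "compaction",
) -> List[Dict[str, str]]:
    """Build the Entries list for batch_get_table_optimizer."""
    if optimizer_type not in VALID_OPTIMIZER_TYPES:
        raise ValueError(f"unknown optimizer type: {optimizer_type!r}")
    return [
        {
            "catalogId": catalog_id,
            "databaseName": database_name,
            "tableName": name,
            "type": optimizer_type,
        }
        for name in table_names
    ]
```

   The result can then be passed as
   "client.batch_get_table_optimizer(Entries=entries)".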

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'TableOptimizers': [
                 {
                     'catalogId': 'string',
                     'databaseName': 'string',
                     'tableName': 'string',
                     'tableOptimizer': {
                         'type': 'compaction'|'retention'|'orphan_file_deletion',
                         'configuration': {
                             'roleArn': 'string',
                             'enabled': True|False,
                             'vpcConfiguration': {
                                 'glueConnectionName': 'string'
                             },
                             'compactionConfiguration': {
                                 'icebergConfiguration': {
                                     'strategy': 'binpack'|'sort'|'z-order',
                                     'minInputFiles': 123,
                                     'deleteFileThreshold': 123
                                 }
                             },
                             'retentionConfiguration': {
                                 'icebergConfiguration': {
                                     'snapshotRetentionPeriodInDays': 123,
                                     'numberOfSnapshotsToRetain': 123,
                                     'cleanExpiredFiles': True|False,
                                     'runRateInHours': 123
                                 }
                             },
                             'orphanFileDeletionConfiguration': {
                                 'icebergConfiguration': {
                                     'orphanFileRetentionPeriodInDays': 123,
                                     'location': 'string',
                                     'runRateInHours': 123
                                 }
                             }
                         },
                         'lastRun': {
                             'eventType': 'starting'|'completed'|'failed'|'in_progress',
                             'startTimestamp': datetime(2015, 1, 1),
                             'endTimestamp': datetime(2015, 1, 1),
                             'metrics': {
                                 'NumberOfBytesCompacted': 'string',
                                 'NumberOfFilesCompacted': 'string',
                                 'NumberOfDpus': 'string',
                                 'JobDurationInHour': 'string'
                             },
                             'error': 'string',
                             'compactionMetrics': {
                                 'IcebergMetrics': {
                                     'NumberOfBytesCompacted': 123,
                                     'NumberOfFilesCompacted': 123,
                                     'DpuHours': 123.0,
                                     'NumberOfDpus': 123,
                                     'JobDurationInHour': 123.0
                                 }
                             },
                             'compactionStrategy': 'binpack'|'sort'|'z-order',
                             'retentionMetrics': {
                                 'IcebergMetrics': {
                                     'NumberOfDataFilesDeleted': 123,
                                     'NumberOfManifestFilesDeleted': 123,
                                     'NumberOfManifestListsDeleted': 123,
                                     'DpuHours': 123.0,
                                     'NumberOfDpus': 123,
                                     'JobDurationInHour': 123.0
                                 }
                             },
                             'orphanFileDeletionMetrics': {
                                 'IcebergMetrics': {
                                     'NumberOfOrphanFilesDeleted': 123,
                                     'DpuHours': 123.0,
                                     'NumberOfDpus': 123,
                                     'JobDurationInHour': 123.0
                                 }
                             }
                         },
                         'configurationSource': 'catalog'|'table'
                     }
                 },
             ],
             'Failures': [
                 {
                     'error': {
                         'ErrorCode': 'string',
                         'ErrorMessage': 'string'
                     },
                     'catalogId': 'string',
                     'databaseName': 'string',
                     'tableName': 'string',
                     'type': 'compaction'|'retention'|'orphan_file_deletion'
                 },
             ]
         }
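
      Because the response reports successes and failures in separate
      lists, callers usually need to check both. The sketch below
      indexes each list by "(databaseName, tableName, type)".
      "summarize_optimizer_response" is a hypothetical helper that
      works on a plain dict shaped like the syntax above, so it can be
      tried without an AWS connection.

```python
from typing import Any, Dict, Optional, Tuple

Key = Tuple[Optional[str], Optional[str], Optional[str]]


def summarize_optimizer_response(
    response: Dict[str, Any],
) -> Tuple[Dict[Key, Optional[bool]], Dict[Key, Optional[str]]]:
    """Return ({key: enabled flag}, {key: error code}) for a batch response."""
    enabled: Dict[Key, Optional[bool]] = {}
    for item in response.get("TableOptimizers", []):
        optimizer = item.get("tableOptimizer", {})
        key = (item.get("databaseName"), item.get("tableName"), optimizer.get("type"))
        # 'enabled' lives under tableOptimizer.configuration.
        enabled[key] = optimizer.get("configuration", {}).get("enabled")

    failures: Dict[Key, Optional[str]] = {}
    for failure in response.get("Failures", []):
        # In Failures, 'type' sits at the top level of each entry.
        key = (failure.get("databaseName"), failure.get("tableName"), failure.get("type"))
        failures[key] = failure.get("error", {}).get("ErrorCode")

    return enabled, failures
```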

      **Response Structure**

      * *(dict) --*

        * **TableOptimizers** *(list) --*

          A list of "BatchTableOptimizer" objects.

          * *(dict) --*

            Contains details for one of the table optimizers returned
            by the "BatchGetTableOptimizer" operation.

            * **catalogId** *(string) --*

              The Catalog ID of the table.

            * **databaseName** *(string) --*

              The name of the database in the catalog in which the
              table resides.

            * **tableName** *(string) --*

              The name of the table.

            * **tableOptimizer** *(dict) --*

              A "TableOptimizer" object that contains details on the
              configuration and last run of a table optimizer.

              * **type** *(string) --*

                The type of table optimizer. The valid values are:

                * "compaction": for managing compaction with a table
                  optimizer.

                * "retention": for managing the retention of snapshots
                  with a table optimizer.

                * "orphan_file_deletion": for managing the deletion of
                  orphan files with a table optimizer.

              * **configuration** *(dict) --*

                A "TableOptimizerConfiguration" object that was
                specified when creating or updating a table optimizer.

                * **roleArn** *(string) --*

                  A role passed by the caller which gives the service
                  permission to update the resources associated with
                  the optimizer on the caller's behalf.

                * **enabled** *(boolean) --*

                  Whether table optimization is enabled.

                * **vpcConfiguration** *(dict) --*

                  A "TableOptimizerVpcConfiguration" object
                  representing the VPC configuration for a table
                  optimizer.

                  This configuration is necessary to perform
                  optimization on tables that are in a customer VPC.

                  Note:

                    This is a Tagged Union structure. Only one of the
                    following top level keys will be set:
                    "glueConnectionName". If a client receives an
                    unknown member it will set "SDK_UNKNOWN_MEMBER" as
                    the top level key, which maps to the name or tag
                    of the unknown member. The structure of
                    "SDK_UNKNOWN_MEMBER" is as follows:

                       'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}

                  * **glueConnectionName** *(string) --*

                    The name of the Glue connection used for the VPC
                    for the table optimizer.

                * **compactionConfiguration** *(dict) --*

                  The configuration for a compaction optimizer. This
                  configuration defines how data files in your table
                  will be compacted to improve query performance and
                  reduce storage costs.

                  * **icebergConfiguration** *(dict) --*

                    The configuration for an Iceberg compaction
                    optimizer.

                    * **strategy** *(string) --*

                      The strategy to use for compaction. Valid values
                      are:

                      * "binpack": Combines small files into larger
                        files, typically targeting sizes over 100MB,
                        while applying any pending deletes. This is
                        the recommended compaction strategy for most
                        use cases.

                      * "sort": Organizes data based on specified
                        columns which are sorted hierarchically during
                        compaction, improving query performance for
                        filtered operations. This strategy is
                        recommended when your queries frequently
                        filter on specific columns. To use this
                        strategy, you must first define a sort order
                        in your Iceberg table properties using the
                        "sort_order" table property.

                      * "z-order": Optimizes data organization by
                        blending multiple attributes into a single
                        scalar value that can be used for sorting,
                        allowing efficient querying across multiple
                        dimensions. This strategy is recommended when
                        you need to query data across multiple
                        dimensions simultaneously. To use this
                        strategy, you must first define a sort order
                        in your Iceberg table properties using the
                        "sort_order" table property.

                      If an input is not provided, the default value
                      "binpack" will be used.

                    * **minInputFiles** *(integer) --*

                      The minimum number of data files that must be
                      present in a partition before compaction will
                      actually compact files. This parameter helps
                      control when compaction is triggered, preventing
                      unnecessary compaction operations on partitions
                      with few files. If an input is not provided, the
                      default value 100 will be used.

                    * **deleteFileThreshold** *(integer) --*

                      The minimum number of deletes that must be
                      present in a data file to make it eligible for
                      compaction. This parameter helps optimize
                      compaction by focusing on files that contain a
                      significant number of delete operations, which
                      can improve query performance by removing
                      deleted records. If an input is not provided,
                      the default value 1 will be used.

                * **retentionConfiguration** *(dict) --*

                  The configuration for a snapshot retention
                  optimizer.

                  * **icebergConfiguration** *(dict) --*

                    The configuration for an Iceberg snapshot
                    retention optimizer.

                    * **snapshotRetentionPeriodInDays** *(integer) --*

                      The number of days to retain the Iceberg
                      snapshots. If an input is not provided, the
                      corresponding Iceberg table configuration field
                      is used; if that field is not present, the
                      default value 5 is used.

                    * **numberOfSnapshotsToRetain** *(integer) --*

                      The number of Iceberg snapshots to retain within
                      the retention period. If an input is not
                      provided, the corresponding Iceberg table
                      configuration field is used; if that field is
                      not present, the default value 1 is used.

                    * **cleanExpiredFiles** *(boolean) --*

                      If set to false, snapshots are only deleted from
                      table metadata, and the underlying data and
                      metadata files are not deleted.

                    * **runRateInHours** *(integer) --*

                      The interval in hours between retention job
                      runs. This parameter controls how frequently the
                      retention optimizer will run to clean up expired
                      snapshots. The value must be between 3 and 168
                      hours (7 days). If an input is not provided, the
                      default value 24 will be used.

                * **orphanFileDeletionConfiguration** *(dict) --*

                  The configuration for an orphan file deletion
                  optimizer.

                  * **icebergConfiguration** *(dict) --*

                    The configuration for an Iceberg orphan file
                    deletion optimizer.

                    * **orphanFileRetentionPeriodInDays** *(integer)
                      --*

                      The number of days that orphan files should be
                      retained before file deletion. If an input is
                      not provided, the default value 3 will be used.

                    * **location** *(string) --*

                      Specifies a directory in which to look for files
                      (defaults to the table's location). You may
                      choose a sub-directory rather than the top-level
                      table location.

                    * **runRateInHours** *(integer) --*

                      The interval in hours between orphan file
                      deletion job runs. This parameter controls how
                      frequently the orphan file deletion optimizer
                      will run to clean up orphan files. The value
                      must be between 3 and 168 hours (7 days). If an
                      input is not provided, the default value 24 will
                      be used.

              * **lastRun** *(dict) --*

                A "TableOptimizerRun" object representing the last run
                of the table optimizer.

                * **eventType** *(string) --*

                  An event type representing the status of the table
                  optimizer run.

                * **startTimestamp** *(datetime) --*

                  Represents the epoch timestamp at which the
                  compaction job was started within Lake Formation.

                * **endTimestamp** *(datetime) --*

                  Represents the epoch timestamp at which the
                  compaction job ended.

                * **metrics** *(dict) --*

                  A "RunMetrics" object containing metrics for the
                  optimizer run.

                  This member is deprecated. See the individual metric
                  members for compaction, retention, and orphan file
                  deletion.

                  * **NumberOfBytesCompacted** *(string) --*

                    The number of bytes removed by the compaction job
                    run.

                  * **NumberOfFilesCompacted** *(string) --*

                    The number of files removed by the compaction job
                    run.

                  * **NumberOfDpus** *(string) --*

                    The number of DPUs consumed by the job, rounded up
                    to the nearest whole number.

                  * **JobDurationInHour** *(string) --*

                    The duration of the job in hours.

                * **error** *(string) --*

                  An error that occurred during the optimizer run.

                * **compactionMetrics** *(dict) --*

                  A "CompactionMetrics" object containing metrics for
                  the optimizer run.

                  * **IcebergMetrics** *(dict) --*

                    A structure containing the Iceberg compaction
                    metrics for the optimizer run.

                    * **NumberOfBytesCompacted** *(integer) --*

                      The number of bytes removed by the compaction
                      job run.

                    * **NumberOfFilesCompacted** *(integer) --*

                      The number of files removed by the compaction
                      job run.

                    * **DpuHours** *(float) --*

                      The number of DPU hours consumed by the job.

                    * **NumberOfDpus** *(integer) --*

                      The number of DPUs consumed by the job, rounded
                      up to the nearest whole number.

                    * **JobDurationInHour** *(float) --*

                      The duration of the job in hours.

                * **compactionStrategy** *(string) --*

                  The strategy used for the compaction run. Indicates
                  which algorithm was applied to determine how files
                  were selected and combined during the compaction
                  process. Valid values are:

                  * "binpack": Combines small files into larger files,
                    typically targeting sizes over 100MB, while
                    applying any pending deletes. This is the
                    recommended compaction strategy for most use
                    cases.

                  * "sort": Organizes data based on specified columns
                    which are sorted hierarchically during compaction,
                    improving query performance for filtered
                    operations. This strategy is recommended when your
                    queries frequently filter on specific columns. To
                    use this strategy, you must first define a sort
                    order in your Iceberg table properties using the
                    "sort_order" table property.

                  * "z-order": Optimizes data organization by blending
                    multiple attributes into a single scalar value
                    that can be used for sorting, allowing efficient
                    querying across multiple dimensions. This strategy
                    is recommended when you need to query data across
                    multiple dimensions simultaneously. To use this
                    strategy, you must first define a sort order in
                    your Iceberg table properties using the
                    "sort_order" table property.

                * **retentionMetrics** *(dict) --*

                  A "RetentionMetrics" object containing metrics for
                  the optimizer run.

                  * **IcebergMetrics** *(dict) --*

                    A structure containing the Iceberg retention
                    metrics for the optimizer run.

                    * **NumberOfDataFilesDeleted** *(integer) --*

                      The number of data files deleted by the
                      retention job run.

                    * **NumberOfManifestFilesDeleted** *(integer) --*

                      The number of manifest files deleted by the
                      retention job run.

                    * **NumberOfManifestListsDeleted** *(integer) --*

                      The number of manifest lists deleted by the
                      retention job run.

                    * **DpuHours** *(float) --*

                      The number of DPU hours consumed by the job.

                    * **NumberOfDpus** *(integer) --*

                      The number of DPUs consumed by the job, rounded
                      up to the nearest whole number.

                    * **JobDurationInHour** *(float) --*

                      The duration of the job in hours.

                * **orphanFileDeletionMetrics** *(dict) --*

                  An "OrphanFileDeletionMetrics" object containing
                  metrics for the optimizer run.

                  * **IcebergMetrics** *(dict) --*

                    A structure containing the Iceberg orphan file
                    deletion metrics for the optimizer run.

                    * **NumberOfOrphanFilesDeleted** *(integer) --*

                      The number of orphan files deleted by the orphan
                      file deletion job run.

                    * **DpuHours** *(float) --*

                      The number of DPU hours consumed by the job.

                    * **NumberOfDpus** *(integer) --*

                      The number of DPUs consumed by the job, rounded
                      up to the nearest whole number.

                    * **JobDurationInHour** *(float) --*

                      The duration of the job in hours.

              * **configurationSource** *(string) --*

                Specifies the source of the optimizer configuration.
                This indicates how the table optimizer was configured
                and which entity or service initiated the
                configuration.

        * **Failures** *(list) --*

          A list of errors from the operation.

          * *(dict) --*

            Contains details on one of the errors in the error list
            returned by the "BatchGetTableOptimizer" operation.

            * **error** *(dict) --*

              An "ErrorDetail" object containing code and message
              details about the error.

              * **ErrorCode** *(string) --*

                The code associated with this error.

              * **ErrorMessage** *(string) --*

                A message describing the error.

            * **catalogId** *(string) --*

              The Catalog ID of the table.

            * **databaseName** *(string) --*

              The name of the database in the catalog in which the
              table resides.

            * **tableName** *(string) --*

              The name of the table.

            * **type** *(string) --*

              The type of table optimizer.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.ThrottlingException"
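
   The response structure above can be consumed along the following
   lines. This is a minimal sketch, not part of the Glue API: the
   helper names "summarize_last_run" and "format_failures" are
   hypothetical, and the code only reads keys documented above
   ("lastRun", the per-optimizer metrics members, and "Failures"). It
   prefers the per-optimizer metrics members over the deprecated
   "metrics" member.

```python
# Hypothetical helpers for reading a BatchGetTableOptimizer response.
# They only touch keys that appear in the response structure above.

METRIC_MEMBERS = (
    "compactionMetrics",
    "retentionMetrics",
    "orphanFileDeletionMetrics",
)


def summarize_last_run(last_run):
    """Flatten a lastRun dict, using the non-deprecated metrics members."""
    summary = {
        "eventType": last_run.get("eventType"),
        "error": last_run.get("error"),
    }
    for member in METRIC_MEMBERS:
        iceberg = last_run.get(member, {}).get("IcebergMetrics")
        if iceberg is not None:
            summary["metricsSource"] = member
            summary["dpuHours"] = iceberg.get("DpuHours")
            summary["jobDurationInHour"] = iceberg.get("JobDurationInHour")
            break
    return summary


def format_failures(response):
    """Return one readable line per entry in the Failures list."""
    lines = []
    for failure in response.get("Failures", []):
        error = failure.get("error", {})
        lines.append(
            "{}.{} [{}]: {}: {}".format(
                failure.get("databaseName"),
                failure.get("tableName"),
                failure.get("type"),
                error.get("ErrorCode"),
                error.get("ErrorMessage"),
            )
        )
    return lines
```

   Both helpers accept plain dictionaries, so they can be exercised
   with a hand-built response before being pointed at a real one.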


delete_ml_transform
*******************

Glue.Client.delete_ml_transform(**kwargs)

   Deletes a Glue machine learning transform. Machine learning
   transforms are a special type of transform that uses machine
   learning to learn the details of the transformation to be performed
   by learning from examples provided by humans. These transformations
   are then saved by Glue. If you no longer need a transform, you can
   delete it by calling "DeleteMLTransform". However, any Glue jobs
   that still reference the deleted transform will no longer succeed.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_ml_transform(
          TransformId='string'
      )

   Parameters:
      **TransformId** (*string*) --

      **[REQUIRED]**

      The unique identifier of the transform to delete.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'TransformId': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **TransformId** *(string) --*

          The unique identifier of the transform that was deleted.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"
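
   A deletion that tolerates an already-missing transform might look
   like the sketch below. The helper name "delete_transform_if_exists"
   is hypothetical, and "client" is assumed to be a Glue client such
   as the one returned by "boto3.client('glue')". Only the
   "EntityNotFoundException" listed above is handled; other exceptions
   propagate to the caller.

```python
def delete_transform_if_exists(client, transform_id):
    """Delete an ML transform and return its ID, or None if it is absent.

    `client` is assumed to be a Glue client; this helper is illustrative
    and not part of boto3.
    """
    try:
        response = client.delete_ml_transform(TransformId=transform_id)
    except client.exceptions.EntityNotFoundException:
        # The transform does not exist (for example, it was already deleted).
        return None
    return response["TransformId"]
```

   Because jobs that still reference a deleted transform will no
   longer succeed, it is worth checking for such jobs before calling
   this helper.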


create_dev_endpoint
*******************

Glue.Client.create_dev_endpoint(**kwargs)

   Creates a new development endpoint.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_dev_endpoint(
          EndpointName='string',
          RoleArn='string',
          SecurityGroupIds=[
              'string',
          ],
          SubnetId='string',
          PublicKey='string',
          PublicKeys=[
              'string',
          ],
          NumberOfNodes=123,
          WorkerType='Standard'|'G.1X'|'G.2X'|'G.025X'|'G.4X'|'G.8X'|'Z.2X',
          GlueVersion='string',
          NumberOfWorkers=123,
          ExtraPythonLibsS3Path='string',
          ExtraJarsS3Path='string',
          SecurityConfiguration='string',
          Tags={
              'string': 'string'
          },
          Arguments={
              'string': 'string'
          }
      )

   Parameters:
      * **EndpointName** (*string*) --

        **[REQUIRED]**

        The name to be assigned to the new "DevEndpoint".

      * **RoleArn** (*string*) --

        **[REQUIRED]**

        The IAM role for the "DevEndpoint".

      * **SecurityGroupIds** (*list*) --

        Security group IDs for the security groups to be used by the
        new "DevEndpoint".

        * *(string) --*

      * **SubnetId** (*string*) -- The subnet ID for the new
        "DevEndpoint" to use.

      * **PublicKey** (*string*) -- The public key to be used by this
        "DevEndpoint" for authentication. This attribute is provided
        for backward compatibility because the recommended attribute
        to use is public keys.

      * **PublicKeys** (*list*) --

        A list of public keys to be used by the development endpoints
        for authentication. The use of this attribute is preferred
        over a single public key because the public keys allow you to
        have a different private key per client.

        Note:

          If you previously created an endpoint with a public key, you
          must remove that key to be able to set a list of public
          keys. Call the "UpdateDevEndpoint" API with the public key
          content in the "deletePublicKeys" attribute, and the list of
          new keys in the "addPublicKeys" attribute.

        * *(string) --*

      * **NumberOfNodes** (*integer*) -- The number of Glue Data
        Processing Units (DPUs) to allocate to this "DevEndpoint".

      * **WorkerType** (*string*) --

        The type of predefined worker that is allocated to the
        development endpoint. Accepts a value of Standard, G.1X, or
        G.2X.

        * For the "Standard" worker type, each worker provides 4 vCPU,
          16 GB of memory and a 50GB disk, and 2 executors per worker.

        * For the "G.1X" worker type, each worker maps to 1 DPU (4
          vCPU, 16 GB of memory, 64 GB disk), and provides 1 executor
          per worker. We recommend this worker type for memory-
          intensive jobs.

        * For the "G.2X" worker type, each worker maps to 2 DPU (8
          vCPU, 32 GB of memory, 128 GB disk), and provides 1 executor
          per worker. We recommend this worker type for memory-
          intensive jobs.

        Known issue: when a development endpoint is created with the
        "G.2X" "WorkerType" configuration, the Spark drivers for the
        development endpoint will run on 4 vCPU, 16 GB of memory, and
        a 64 GB disk.

      * **GlueVersion** (*string*) --

        Glue version determines the versions of Apache Spark and
        Python that Glue supports. The Python version indicates the
        version supported for running your ETL scripts on development
        endpoints.

        For more information about the available Glue versions and
        corresponding Spark and Python versions, see Glue version in
        the developer guide.

        Development endpoints that are created without specifying a
        Glue version default to Glue 0.9.

        You can specify a version of Python support for development
        endpoints by using the "Arguments" parameter in the
        "CreateDevEndpoint" or "UpdateDevEndpoint" APIs. If no
        arguments are provided, the version defaults to Python 2.

      * **NumberOfWorkers** (*integer*) --

        The number of workers of a defined "workerType" that are
        allocated to the development endpoint.

        The maximum number of workers you can define is 299 for
        "G.1X", and 149 for "G.2X".

      * **ExtraPythonLibsS3Path** (*string*) --

        The paths to one or more Python libraries in an Amazon S3
        bucket that should be loaded in your "DevEndpoint". Multiple
        values must be complete paths separated by a comma.

        Note:

          You can only use pure Python libraries with a "DevEndpoint".
          Libraries that rely on C extensions, such as the pandas
          Python data analysis library, are not yet supported.

      * **ExtraJarsS3Path** (*string*) -- The path to one or more Java
        ".jar" files in an S3 bucket that should be loaded in your
        "DevEndpoint".

      * **SecurityConfiguration** (*string*) -- The name of the
        "SecurityConfiguration" structure to be used with this
        "DevEndpoint".

      * **Tags** (*dict*) --

        The tags to use with this DevEndpoint. You may use tags to
        limit access to the DevEndpoint. For more information about
        tags in Glue, see Amazon Web Services Tags in Glue in the
        developer guide.

        * *(string) --*

          * *(string) --*

      * **Arguments** (*dict*) --

        A map of arguments used to configure the "DevEndpoint".

        * *(string) --*

          * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'EndpointName': 'string',
             'Status': 'string',
             'SecurityGroupIds': [
                 'string',
             ],
             'SubnetId': 'string',
             'RoleArn': 'string',
             'YarnEndpointAddress': 'string',
             'ZeppelinRemoteSparkInterpreterPort': 123,
             'NumberOfNodes': 123,
             'WorkerType': 'Standard'|'G.1X'|'G.2X'|'G.025X'|'G.4X'|'G.8X'|'Z.2X',
             'GlueVersion': 'string',
             'NumberOfWorkers': 123,
             'AvailabilityZone': 'string',
             'VpcId': 'string',
             'ExtraPythonLibsS3Path': 'string',
             'ExtraJarsS3Path': 'string',
             'FailureReason': 'string',
             'SecurityConfiguration': 'string',
             'CreatedTimestamp': datetime(2015, 1, 1),
             'Arguments': {
                 'string': 'string'
             }
         }

      **Response Structure**

      * *(dict) --*

        * **EndpointName** *(string) --*

          The name assigned to the new "DevEndpoint".

        * **Status** *(string) --*

          The current status of the new "DevEndpoint".

        * **SecurityGroupIds** *(list) --*

          The security groups assigned to the new "DevEndpoint".

          * *(string) --*

        * **SubnetId** *(string) --*

          The subnet ID assigned to the new "DevEndpoint".

        * **RoleArn** *(string) --*

          The Amazon Resource Name (ARN) of the role assigned to the
          new "DevEndpoint".

        * **YarnEndpointAddress** *(string) --*

          The address of the YARN endpoint used by this "DevEndpoint".

        * **ZeppelinRemoteSparkInterpreterPort** *(integer) --*

          The Apache Zeppelin port for the remote Apache Spark
          interpreter.

        * **NumberOfNodes** *(integer) --*

          The number of Glue Data Processing Units (DPUs) allocated to
          this DevEndpoint.

        * **WorkerType** *(string) --*

          The type of predefined worker that is allocated to the
          development endpoint. May be a value of Standard, G.1X, or
          G.2X.

        * **GlueVersion** *(string) --*

          Glue version determines the versions of Apache Spark and
          Python that Glue supports. The Python version indicates the
          version supported for running your ETL scripts on
          development endpoints.

          For more information about the available Glue versions and
          corresponding Spark and Python versions, see Glue version in
          the developer guide.

        * **NumberOfWorkers** *(integer) --*

          The number of workers of a defined "workerType" that are
          allocated to the development endpoint.

        * **AvailabilityZone** *(string) --*

          The Amazon Web Services Availability Zone where this
          "DevEndpoint" is located.

        * **VpcId** *(string) --*

          The ID of the virtual private cloud (VPC) used by this
          "DevEndpoint".

        * **ExtraPythonLibsS3Path** *(string) --*

          The paths to one or more Python libraries in an S3 bucket
          that will be loaded in your "DevEndpoint".

        * **ExtraJarsS3Path** *(string) --*

          Path to one or more Java ".jar" files in an S3 bucket that
          will be loaded in your "DevEndpoint".

        * **FailureReason** *(string) --*

          The reason for a current failure in this "DevEndpoint".

        * **SecurityConfiguration** *(string) --*

          The name of the "SecurityConfiguration" structure being used
          with this "DevEndpoint".

        * **CreatedTimestamp** *(datetime) --*

          The point in time at which this "DevEndpoint" was created.

        * **Arguments** *(dict) --*

          The map of arguments used to configure this "DevEndpoint".

          Valid arguments are:

          * ""--enable-glue-datacatalog": """

          You can specify a version of Python support for development
          endpoints by using the "Arguments" parameter in the
          "CreateDevEndpoint" or "UpdateDevEndpoint" APIs. If no
          arguments are provided, the version defaults to Python 2.

          * *(string) --*

            * *(string) --*

   **Exceptions**

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.IdempotentParameterMismatchException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.ValidationException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"
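
   The request parameters can be assembled and checked before the
   call, as in the sketch below. The helper name
   "build_dev_endpoint_request" and the "MAX_WORKERS" table are
   hypothetical; the limits (299 for "G.1X", 149 for "G.2X") are the
   ones stated under "NumberOfWorkers" above. Only a subset of the
   parameters is covered.

```python
# Worker limits as documented for NumberOfWorkers above.
MAX_WORKERS = {"G.1X": 299, "G.2X": 149}


def build_dev_endpoint_request(endpoint_name, role_arn, worker_type=None,
                               number_of_workers=None, public_keys=None):
    """Build keyword arguments for create_dev_endpoint (illustrative only)."""
    # EndpointName and RoleArn are the two required parameters.
    request = {"EndpointName": endpoint_name, "RoleArn": role_arn}
    if worker_type is not None:
        request["WorkerType"] = worker_type
    if number_of_workers is not None:
        limit = MAX_WORKERS.get(worker_type)
        if limit is not None and number_of_workers > limit:
            raise ValueError(
                "{} allows at most {} workers, got {}".format(
                    worker_type, limit, number_of_workers
                )
            )
        request["NumberOfWorkers"] = number_of_workers
    if public_keys:
        # PublicKeys is preferred over the single PublicKey attribute.
        request["PublicKeys"] = list(public_keys)
    return request
```

   The resulting dictionary can then be passed as
   "client.create_dev_endpoint(**request)", assuming "client" is a
   Glue client.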


list_data_quality_ruleset_evaluation_runs
*****************************************

Glue.Client.list_data_quality_ruleset_evaluation_runs(**kwargs)

   Lists all the runs meeting the filter criteria, where a ruleset is
   evaluated against a data source.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.list_data_quality_ruleset_evaluation_runs(
          Filter={
              'DataSource': {
                  'GlueTable': {
                      'DatabaseName': 'string',
                      'TableName': 'string',
                      'CatalogId': 'string',
                      'ConnectionName': 'string',
                      'AdditionalOptions': {
                          'string': 'string'
                      }
                  }
              },
              'StartedBefore': datetime(2015, 1, 1),
              'StartedAfter': datetime(2015, 1, 1)
          },
          NextToken='string',
          MaxResults=123
      )

   Parameters:
      * **Filter** (*dict*) --

        The filter criteria.

        * **DataSource** *(dict) --* **[REQUIRED]**

          Filter based on a data source (a Glue table) associated
          with the run.

          * **GlueTable** *(dict) --* **[REQUIRED]**

            A Glue table.

            * **DatabaseName** *(string) --* **[REQUIRED]**

              A database name in the Glue Data Catalog.

            * **TableName** *(string) --* **[REQUIRED]**

              A table name in the Glue Data Catalog.

            * **CatalogId** *(string) --*

              A unique identifier for the Glue Data Catalog.

            * **ConnectionName** *(string) --*

              The name of the connection to the Glue Data Catalog.

            * **AdditionalOptions** *(dict) --*

              Additional options for the table. Currently there are
              two keys supported:

              * "pushDownPredicate": to filter on partitions without
                having to list and read all the files in your dataset.

              * "catalogPartitionPredicate": to use server-side
                partition pruning using partition indexes in the Glue
                Data Catalog.

              * *(string) --*

                * *(string) --*

        * **StartedBefore** *(datetime) --*

          Filter results by runs that started before this time.

        * **StartedAfter** *(datetime) --*

          Filter results by runs that started after this time.

      * **NextToken** (*string*) -- A pagination token to offset the
        results.

      * **MaxResults** (*integer*) -- The maximum number of results to
        return.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Runs': [
                 {
                     'RunId': 'string',
                     'Status': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT',
                     'StartedOn': datetime(2015, 1, 1),
                     'DataSource': {
                         'GlueTable': {
                             'DatabaseName': 'string',
                             'TableName': 'string',
                             'CatalogId': 'string',
                             'ConnectionName': 'string',
                             'AdditionalOptions': {
                                 'string': 'string'
                             }
                         }
                     }
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Runs** *(list) --*

          A list of "DataQualityRulesetEvaluationRunDescription"
          objects representing data quality ruleset runs.

          * *(dict) --*

            Describes the result of a data quality ruleset evaluation
            run.

            * **RunId** *(string) --*

              The unique run identifier associated with this run.

            * **Status** *(string) --*

              The status for this run.

            * **StartedOn** *(datetime) --*

              The date and time when the run started.

            * **DataSource** *(dict) --*

              The data source (a Glue table) associated with the run.

              * **GlueTable** *(dict) --*

                A Glue table.

                * **DatabaseName** *(string) --*

                  A database name in the Glue Data Catalog.

                * **TableName** *(string) --*

                  A table name in the Glue Data Catalog.

                * **CatalogId** *(string) --*

                  A unique identifier for the Glue Data Catalog.

                * **ConnectionName** *(string) --*

                  The name of the connection to the Glue Data Catalog.

                * **AdditionalOptions** *(dict) --*

                  Additional options for the table. Currently there
                  are two keys supported:

                  * "pushDownPredicate": to filter on partitions
                    without having to list and read all the files in
                    your dataset.

                  * "catalogPartitionPredicate": to use server-side
                    partition pruning using partition indexes in the
                    Glue Data Catalog.

                  * *(string) --*

                    * *(string) --*

        * **NextToken** *(string) --*

          A pagination token, if more results are available.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"
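
   Since results are returned in pages, a caller usually follows
   "NextToken" until it is absent. The generator below is a sketch of
   that loop; the name "iter_evaluation_runs" is hypothetical and
   "client" is assumed to be a Glue client. It fills in only the
   required "DatabaseName" and "TableName" filter fields plus an
   optional "StartedAfter".

```python
def iter_evaluation_runs(client, database_name, table_name, started_after=None):
    """Yield every run description for one Glue table, page by page.

    Illustrative helper; `client` is assumed to be a Glue client.
    """
    run_filter = {
        "DataSource": {
            "GlueTable": {
                "DatabaseName": database_name,
                "TableName": table_name,
            }
        }
    }
    if started_after is not None:
        run_filter["StartedAfter"] = started_after

    kwargs = {"Filter": run_filter}
    while True:
        page = client.list_data_quality_ruleset_evaluation_runs(**kwargs)
        yield from page.get("Runs", [])
        token = page.get("NextToken")
        if not token:
            # No pagination token means this was the last page.
            break
        kwargs["NextToken"] = token
```

   For example, "[run['RunId'] for run in iter_evaluation_runs(client,
   'sales', 'orders')]" would collect the run identifiers for a
   hypothetical "orders" table in a "sales" database.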


untag_resource
**************

Glue.Client.untag_resource(**kwargs)

   Removes tags from a resource.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.untag_resource(
          ResourceArn='string',
          TagsToRemove=[
              'string',
          ]
      )

   Parameters:
      * **ResourceArn** (*string*) --

        **[REQUIRED]**

        The Amazon Resource Name (ARN) of the resource from which to
        remove the tags.

      * **TagsToRemove** (*list*) --

        **[REQUIRED]**

        Tags to remove from this resource.

        * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.EntityNotFoundException"
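
   A small wrapper can tidy the tag list before the call. The sketch
   below is illustrative: the name "remove_tags" is hypothetical and
   "client" is assumed to be a Glue client. It drops duplicate keys
   and skips the request entirely when there is nothing to remove,
   which is a design choice of this example rather than documented
   service behaviour.

```python
def remove_tags(client, resource_arn, tag_keys):
    """Remove the given tag keys from a resource and return the keys sent.

    Illustrative helper; `client` is assumed to be a Glue client.
    """
    # Drop duplicates while preserving the caller's order.
    keys = list(dict.fromkeys(tag_keys))
    if not keys:
        # Nothing to remove, so no request is made.
        return []
    client.untag_resource(ResourceArn=resource_arn, TagsToRemove=keys)
    return keys
```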


start_trigger
*************

Glue.Client.start_trigger(**kwargs)

   Starts an existing trigger. See Triggering Jobs for information
   about how different types of trigger are started.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.start_trigger(
          Name='string'
      )

   Parameters:
      **Name** (*string*) --

      **[REQUIRED]**

      The name of the trigger to start.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Name': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Name** *(string) --*

          The name of the trigger that was started.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.ConcurrentRunsExceededException"
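
The following sketch shows one way to call "start_trigger" and handle a missing trigger. The helper name "start_trigger_if_exists" and the choice to return "None" for an unknown trigger are assumptions made for illustration. "client" is assumed to be a Glue client created with "boto3.client('glue')".

```python
def start_trigger_if_exists(client, trigger_name):
    """Start a trigger and return its name, or None if it does not exist.

    `client` is assumed to be a Glue client (boto3.client('glue')).
    """
    try:
        response = client.start_trigger(Name=trigger_name)
    except client.exceptions.EntityNotFoundException:
        # Glue does not know this trigger name; report that to the caller.
        return None
    # On success the response echoes the name of the started trigger.
    return response["Name"]
```

The other exceptions listed above, such as "ConcurrentRunsExceededException", are deliberately not caught here, so they surface to the caller.
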


get_dev_endpoints
*****************

Glue.Client.get_dev_endpoints(**kwargs)

   Retrieves all the development endpoints in this Amazon Web Services
   account.

   Note:

     When you create a development endpoint in a virtual private cloud
     (VPC), Glue returns only a private IP address and the public IP
     address field is not populated. When you create a non-VPC
     development endpoint, Glue returns only a public IP address.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_dev_endpoints(
          MaxResults=123,
          NextToken='string'
      )

   Parameters:
      * **MaxResults** (*integer*) -- The maximum size of information
        to return.

      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation call.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'DevEndpoints': [
                 {
                     'EndpointName': 'string',
                     'RoleArn': 'string',
                     'SecurityGroupIds': [
                         'string',
                     ],
                     'SubnetId': 'string',
                     'YarnEndpointAddress': 'string',
                     'PrivateAddress': 'string',
                     'ZeppelinRemoteSparkInterpreterPort': 123,
                     'PublicAddress': 'string',
                     'Status': 'string',
                     'WorkerType': 'Standard'|'G.1X'|'G.2X'|'G.025X'|'G.4X'|'G.8X'|'Z.2X',
                     'GlueVersion': 'string',
                     'NumberOfWorkers': 123,
                     'NumberOfNodes': 123,
                     'AvailabilityZone': 'string',
                     'VpcId': 'string',
                     'ExtraPythonLibsS3Path': 'string',
                     'ExtraJarsS3Path': 'string',
                     'FailureReason': 'string',
                     'LastUpdateStatus': 'string',
                     'CreatedTimestamp': datetime(2015, 1, 1),
                     'LastModifiedTimestamp': datetime(2015, 1, 1),
                     'PublicKey': 'string',
                     'PublicKeys': [
                         'string',
                     ],
                     'SecurityConfiguration': 'string',
                     'Arguments': {
                         'string': 'string'
                     }
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **DevEndpoints** *(list) --*

          A list of "DevEndpoint" definitions.

          * *(dict) --*

            A development endpoint where a developer can remotely
            debug extract, transform, and load (ETL) scripts.

            * **EndpointName** *(string) --*

              The name of the "DevEndpoint".

            * **RoleArn** *(string) --*

              The Amazon Resource Name (ARN) of the IAM role used in
              this "DevEndpoint".

            * **SecurityGroupIds** *(list) --*

              A list of security group identifiers used in this
              "DevEndpoint".

              * *(string) --*

            * **SubnetId** *(string) --*

              The subnet ID for this "DevEndpoint".

            * **YarnEndpointAddress** *(string) --*

              The YARN endpoint address used by this "DevEndpoint".

            * **PrivateAddress** *(string) --*

              A private IP address to access the "DevEndpoint" within
              a VPC if the "DevEndpoint" is created within one. The
              "PrivateAddress" field is present only when you create
              the "DevEndpoint" within your VPC.

            * **ZeppelinRemoteSparkInterpreterPort** *(integer) --*

              The Apache Zeppelin port for the remote Apache Spark
              interpreter.

            * **PublicAddress** *(string) --*

              The public IP address used by this "DevEndpoint". The
              "PublicAddress" field is present only when you create a
              non-virtual private cloud (VPC) "DevEndpoint".

            * **Status** *(string) --*

              The current status of this "DevEndpoint".

            * **WorkerType** *(string) --*

              The type of predefined worker that is allocated to the
              development endpoint. Accepts a value of Standard, G.1X,
              or G.2X.

              * For the "Standard" worker type, each worker provides 4
                vCPU, 16 GB of memory and a 50 GB disk, and 2 executors
                per worker.

              * For the "G.1X" worker type, each worker maps to 1 DPU
                (4 vCPU, 16 GB of memory, 64 GB disk), and provides 1
                executor per worker. We recommend this worker type for
                memory-intensive jobs.

              * For the "G.2X" worker type, each worker maps to 2 DPU
                (8 vCPU, 32 GB of memory, 128 GB disk), and provides 1
                executor per worker. We recommend this worker type for
                memory-intensive jobs.

              Known issue: when a development endpoint is created with
              the "G.2X" "WorkerType" configuration, the Spark drivers
              for the development endpoint will run on 4 vCPU, 16 GB
              of memory, and a 64 GB disk.

            * **GlueVersion** *(string) --*

              Glue version determines the versions of Apache Spark and
              Python that Glue supports. The Python version indicates
              the version supported for running your ETL scripts on
              development endpoints.

              For more information about the available Glue versions
              and corresponding Spark and Python versions, see Glue
              version in the developer guide.

              Development endpoints that are created without
              specifying a Glue version default to Glue 0.9.

              You can specify a version of Python support for
              development endpoints by using the "Arguments" parameter
              in the "CreateDevEndpoint" or "UpdateDevEndpoint" APIs.
              If no arguments are provided, the version defaults to
              Python 2.

            * **NumberOfWorkers** *(integer) --*

              The number of workers of a defined "workerType" that are
              allocated to the development endpoint.

              The maximum number of workers you can define is 299 for
              "G.1X", and 149 for "G.2X".

            * **NumberOfNodes** *(integer) --*

              The number of Glue Data Processing Units (DPUs)
              allocated to this "DevEndpoint".

            * **AvailabilityZone** *(string) --*

              The Amazon Web Services Availability Zone where this
              "DevEndpoint" is located.

            * **VpcId** *(string) --*

              The ID of the virtual private cloud (VPC) used by this
              "DevEndpoint".

            * **ExtraPythonLibsS3Path** *(string) --*

              The paths to one or more Python libraries in an Amazon
              S3 bucket that should be loaded in your "DevEndpoint".
              Multiple values must be complete paths separated by a
              comma.

              Note:

                You can only use pure Python libraries with a
                "DevEndpoint". Libraries that rely on C extensions,
                such as the pandas Python data analysis library, are
                not currently supported.

            * **ExtraJarsS3Path** *(string) --*

              The path to one or more Java ".jar" files in an S3
              bucket that should be loaded in your "DevEndpoint".

              Note:

                You can only use pure Java/Scala libraries with a
                "DevEndpoint".

            * **FailureReason** *(string) --*

              The reason for a current failure in this "DevEndpoint".

            * **LastUpdateStatus** *(string) --*

              The status of the last update.

            * **CreatedTimestamp** *(datetime) --*

              The point in time at which this "DevEndpoint" was created.

            * **LastModifiedTimestamp** *(datetime) --*

              The point in time at which this "DevEndpoint" was last
              modified.

            * **PublicKey** *(string) --*

              The public key to be used by this "DevEndpoint" for
              authentication. This attribute is provided for backward
              compatibility because the recommended attribute to use
              is public keys.

            * **PublicKeys** *(list) --*

              A list of public keys to be used by the "DevEndpoints"
              for authentication. Using this attribute is preferred
              over a single public key because the public keys allow
              you to have a different private key per client.

              Note:

                If you previously created an endpoint with a public
                key, you must remove that key to be able to set a list
                of public keys. Call the "UpdateDevEndpoint" API
                operation with the public key content in the
                "deletePublicKeys" attribute, and the list of new keys
                in the "addPublicKeys" attribute.

              * *(string) --*

            * **SecurityConfiguration** *(string) --*

              The name of the "SecurityConfiguration" structure to be
              used with this "DevEndpoint".

            * **Arguments** *(dict) --*

              A map of arguments used to configure the "DevEndpoint".

              Valid arguments are:

              * "--enable-glue-datacatalog": ""

              You can specify a version of Python support for
              development endpoints by using the "Arguments" parameter
              in the "CreateDevEndpoint" or "UpdateDevEndpoint" APIs.
              If no arguments are provided, the version defaults to
              Python 2.

              * *(string) --*

                * *(string) --*

        * **NextToken** *(string) --*

          A continuation token, if not all "DevEndpoint" definitions
          have yet been returned.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"
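
The "NextToken" handling described above can be sketched as a small generator. The names "iter_dev_endpoints" and "endpoint_address" are hypothetical helpers, and "client" is assumed to be a Glue client from "boto3.client('glue')". The second helper relies on the note above that only one of "PrivateAddress" and "PublicAddress" is populated.

```python
def iter_dev_endpoints(client, page_size=None):
    """Yield every DevEndpoint dict, following NextToken across pages.

    `client` is assumed to be a Glue client (boto3.client('glue')).
    """
    kwargs = {}
    if page_size is not None:
        kwargs["MaxResults"] = page_size
    while True:
        page = client.get_dev_endpoints(**kwargs)
        yield from page.get("DevEndpoints", [])
        token = page.get("NextToken")
        if not token:
            # No continuation token means this was the last page.
            break
        kwargs["NextToken"] = token


def endpoint_address(endpoint):
    """Return whichever address Glue populated for this endpoint, if any.

    VPC endpoints carry only PrivateAddress; non-VPC endpoints carry
    only PublicAddress.
    """
    return endpoint.get("PrivateAddress") or endpoint.get("PublicAddress")
```

A caller can then write "for ep in iter_dev_endpoints(client): print(ep['EndpointName'], endpoint_address(ep))". As an alternative to the manual loop, "client.can_paginate('get_dev_endpoints')" reports whether a built-in paginator is available.
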


get_data_catalog_encryption_settings
************************************

Glue.Client.get_data_catalog_encryption_settings(**kwargs)

   Retrieves the security configuration for a specified catalog.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_data_catalog_encryption_settings(
          CatalogId='string'
      )

   Parameters:
      **CatalogId** (*string*) -- The ID of the Data Catalog to
      retrieve the security configuration for. If none is provided,
      the Amazon Web Services account ID is used by default.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'DataCatalogEncryptionSettings': {
                 'EncryptionAtRest': {
                     'CatalogEncryptionMode': 'DISABLED'|'SSE-KMS'|'SSE-KMS-WITH-SERVICE-ROLE',
                     'SseAwsKmsKeyId': 'string',
                     'CatalogEncryptionServiceRole': 'string'
                 },
                 'ConnectionPasswordEncryption': {
                     'ReturnConnectionPasswordEncrypted': True|False,
                     'AwsKmsKeyId': 'string'
                 }
             }
         }

      **Response Structure**

      * *(dict) --*

        * **DataCatalogEncryptionSettings** *(dict) --*

          The requested security configuration.

          * **EncryptionAtRest** *(dict) --*

            Specifies the encryption-at-rest configuration for the
            Data Catalog.

            * **CatalogEncryptionMode** *(string) --*

              The encryption-at-rest mode for encrypting Data Catalog
              data.

            * **SseAwsKmsKeyId** *(string) --*

              The ID of the KMS key to use for encryption at rest.

            * **CatalogEncryptionServiceRole** *(string) --*

              The role that Glue assumes to encrypt and decrypt the
              Data Catalog objects on the caller's behalf.

          * **ConnectionPasswordEncryption** *(dict) --*

            When connection password protection is enabled, the Data
            Catalog uses a customer-provided key to encrypt the
            password as part of "CreateConnection" or
            "UpdateConnection" and store it in the
            "ENCRYPTED_PASSWORD" field in the connection properties.
            You can enable catalog encryption or only password
            encryption.

            * **ReturnConnectionPasswordEncrypted** *(boolean) --*

              When the "ReturnConnectionPasswordEncrypted" flag is set
              to "true", passwords remain encrypted in the responses
              of "GetConnection" and "GetConnections". This encryption
              takes effect independently from catalog encryption.

            * **AwsKmsKeyId** *(string) --*

              A KMS key that is used to encrypt the connection
              password.

              If connection password protection is enabled, the caller
              of "CreateConnection" and "UpdateConnection" needs at
              least "kms:Encrypt" permission on the specified KMS key,
              to encrypt passwords before storing them in the Data
              Catalog.

              You can set the decrypt permission to enable or restrict
              access on the password key according to your security
              requirements.

   **Exceptions**

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"
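
As an illustration of reading the response structure above, the sketch below condenses the settings into three values. The helper name "summarize_catalog_encryption", the output keys, and the choice to treat a missing "CatalogEncryptionMode" as "DISABLED" are assumptions made for this example. "client" is assumed to be a Glue client from "boto3.client('glue')".

```python
def summarize_catalog_encryption(client, catalog_id=None):
    """Summarize Data Catalog encryption settings (hypothetical helper).

    `client` is assumed to be a Glue client (boto3.client('glue')).
    """
    # Omit CatalogId so Glue falls back to the caller's account ID.
    kwargs = {"CatalogId": catalog_id} if catalog_id else {}
    response = client.get_data_catalog_encryption_settings(**kwargs)
    settings = response.get("DataCatalogEncryptionSettings", {})
    at_rest = settings.get("EncryptionAtRest", {})
    passwords = settings.get("ConnectionPasswordEncryption", {})
    # Assumption: a missing mode is reported as DISABLED.
    mode = at_rest.get("CatalogEncryptionMode", "DISABLED")
    return {
        "catalog_encrypted": mode != "DISABLED",
        "catalog_encryption_mode": mode,
        "passwords_returned_encrypted": bool(
            passwords.get("ReturnConnectionPasswordEncrypted", False)
        ),
    }
```

Because catalog encryption and password encryption take effect independently, the summary reports them as separate fields.
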


get_connection
**************

Glue.Client.get_connection(**kwargs)

   Retrieves a connection definition from the Data Catalog.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_connection(
          CatalogId='string',
          Name='string',
          HidePassword=True|False,
          ApplyOverrideForComputeEnvironment='SPARK'|'ATHENA'|'PYTHON'
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog in
        which the connection resides. If none is provided, the Amazon
        Web Services account ID is used by default.

      * **Name** (*string*) --

        **[REQUIRED]**

        The name of the connection definition to retrieve.

      * **HidePassword** (*boolean*) -- Allows you to retrieve the
        connection metadata without returning the password. For
        instance, the Glue console uses this flag to retrieve the
        connection, and does not display the password. Set this
        parameter when the caller might not have permission to use the
        KMS key to decrypt the password, but it does have permission
        to access the rest of the connection properties.

      * **ApplyOverrideForComputeEnvironment** (*string*) -- For
        connections that may be used in multiple services, specifies
        returning properties for the specified compute environment.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Connection': {
                 'Name': 'string',
                 'Description': 'string',
                 'ConnectionType': 'JDBC'|'SFTP'|'MONGODB'|'KAFKA'|'NETWORK'|'MARKETPLACE'|'CUSTOM'|'SALESFORCE'|'VIEW_VALIDATION_REDSHIFT'|'VIEW_VALIDATION_ATHENA'|'GOOGLEADS'|'GOOGLESHEETS'|'GOOGLEANALYTICS4'|'SERVICENOW'|'MARKETO'|'SAPODATA'|'ZENDESK'|'JIRACLOUD'|'NETSUITEERP'|'HUBSPOT'|'FACEBOOKADS'|'INSTAGRAMADS'|'ZOHOCRM'|'SALESFORCEPARDOT'|'SALESFORCEMARKETINGCLOUD'|'SLACK'|'STRIPE'|'INTERCOM'|'SNAPCHATADS',
                 'MatchCriteria': [
                     'string',
                 ],
                 'ConnectionProperties': {
                     'string': 'string'
                 },
                 'SparkProperties': {
                     'string': 'string'
                 },
                 'AthenaProperties': {
                     'string': 'string'
                 },
                 'PythonProperties': {
                     'string': 'string'
                 },
                 'PhysicalConnectionRequirements': {
                     'SubnetId': 'string',
                     'SecurityGroupIdList': [
                         'string',
                     ],
                     'AvailabilityZone': 'string'
                 },
                 'CreationTime': datetime(2015, 1, 1),
                 'LastUpdatedTime': datetime(2015, 1, 1),
                 'LastUpdatedBy': 'string',
                 'Status': 'READY'|'IN_PROGRESS'|'FAILED',
                 'StatusReason': 'string',
                 'LastConnectionValidationTime': datetime(2015, 1, 1),
                 'AuthenticationConfiguration': {
                     'AuthenticationType': 'BASIC'|'OAUTH2'|'CUSTOM'|'IAM',
                     'SecretArn': 'string',
                     'OAuth2Properties': {
                         'OAuth2GrantType': 'AUTHORIZATION_CODE'|'CLIENT_CREDENTIALS'|'JWT_BEARER',
                         'OAuth2ClientApplication': {
                             'UserManagedClientApplicationClientId': 'string',
                             'AWSManagedClientApplicationReference': 'string'
                         },
                         'TokenUrl': 'string',
                         'TokenUrlParametersMap': {
                             'string': 'string'
                         }
                     }
                 },
                 'ConnectionSchemaVersion': 123,
                 'CompatibleComputeEnvironments': [
                     'SPARK'|'ATHENA'|'PYTHON',
                 ]
             }
         }

      **Response Structure**

      * *(dict) --*

        * **Connection** *(dict) --*

          The requested connection definition.

          * **Name** *(string) --*

            The name of the connection definition.

          * **Description** *(string) --*

            The description of the connection.

          * **ConnectionType** *(string) --*

            The type of the connection. Currently, SFTP is not
            supported.

          * **MatchCriteria** *(list) --*

            A list of criteria that can be used in selecting this
            connection.

            * *(string) --*

          * **ConnectionProperties** *(dict) --*

            These key-value pairs define parameters for the connection
            when using the version 1 Connection schema:

            * "HOST" - The host URI: either the fully qualified domain
              name (FQDN) or the IPv4 address of the database host.

            * "PORT" - The port number, between 1024 and 65535, of the
              port on which the database host is listening for
              database connections.

            * "USER_NAME" - The name under which to log in to the
              database. The value string for "USER_NAME" is
              "USERNAME".

            * "PASSWORD" - A password, if one is used, for the user
              name.

            * "ENCRYPTED_PASSWORD" - When you enable connection
              password protection by setting
              "ConnectionPasswordEncryption" in the Data Catalog
              encryption settings, this field stores the encrypted
              password.

            * "JDBC_DRIVER_JAR_URI" - The Amazon Simple Storage
              Service (Amazon S3) path of the JAR file that contains
              the JDBC driver to use.

            * "JDBC_DRIVER_CLASS_NAME" - The class name of the JDBC
              driver to use.

            * "JDBC_ENGINE" - The name of the JDBC engine to use.

            * "JDBC_ENGINE_VERSION" - The version of the JDBC engine
              to use.

            * "CONFIG_FILES" - (Reserved for future use.)

            * "INSTANCE_ID" - The instance ID to use.

            * "JDBC_CONNECTION_URL" - The URL for connecting to a JDBC
              data source.

            * "JDBC_ENFORCE_SSL" - A Boolean string (true, false)
              specifying whether Secure Sockets Layer (SSL) with
              hostname matching is enforced for the JDBC connection on
              the client. The default is false.

            * "CUSTOM_JDBC_CERT" - An Amazon S3 location specifying
              the customer's root certificate. Glue uses this root
              certificate to validate the customer’s certificate when
              connecting to the customer database. Glue only handles
              X.509 certificates. The certificate provided must be
              DER-encoded and supplied in Base64-encoded PEM format.

            * "SKIP_CUSTOM_JDBC_CERT_VALIDATION" - By default, this is
              "false". Glue validates the Signature algorithm and
              Subject Public Key Algorithm for the customer
              certificate. The only permitted algorithms for the
              Signature algorithm are SHA256withRSA, SHA384withRSA or
              SHA512withRSA. For the Subject Public Key Algorithm, the
              key length must be at least 2048. You can set the value
              of this property to "true" to skip Glue’s validation of
              the customer certificate.

            * "CUSTOM_JDBC_CERT_STRING" - A custom JDBC certificate
              string which is used for domain match or distinguished
              name match to prevent a man-in-the-middle attack. In
              Oracle database, this is used as the
              "SSL_SERVER_CERT_DN"; in Microsoft SQL Server, this is
              used as the "hostNameInCertificate".

            * "CONNECTION_URL" - The URL for connecting to a general
              (non-JDBC) data source.

            * "SECRET_ID" - The ID of the secret in Secrets Manager
              that holds the credentials.

            * "CONNECTOR_URL" - The connector URL for a MARKETPLACE or
              CUSTOM connection.

            * "CONNECTOR_TYPE" - The connector type for a MARKETPLACE
              or CUSTOM connection.

            * "CONNECTOR_CLASS_NAME" - The connector class name for a
              MARKETPLACE or CUSTOM connection.

            * "KAFKA_BOOTSTRAP_SERVERS" - A comma-separated list of
              host and port pairs that are the addresses of the Apache
              Kafka brokers in a Kafka cluster to which a Kafka client
              will connect and bootstrap itself.

            * "KAFKA_SSL_ENABLED" - Whether to enable or disable SSL
              on an Apache Kafka connection. Default value is "true".

            * "KAFKA_CUSTOM_CERT" - The Amazon S3 URL for the private
              CA cert file (.pem format). The default is an empty
              string.

            * "KAFKA_SKIP_CUSTOM_CERT_VALIDATION" - Whether to skip
              the validation of the CA cert file or not. Glue
              validates for three algorithms: SHA256withRSA,
              SHA384withRSA and SHA512withRSA. Default value is
              "false".

            * "KAFKA_CLIENT_KEYSTORE" - The Amazon S3 location of the
              client keystore file for Kafka client side
              authentication (Optional).

            * "KAFKA_CLIENT_KEYSTORE_PASSWORD" - The password to
              access the provided keystore (Optional).

            * "KAFKA_CLIENT_KEY_PASSWORD" - A keystore can consist of
              multiple keys, so this is the password to access the
              client key to be used with the Kafka server side key
              (Optional).

            * "ENCRYPTED_KAFKA_CLIENT_KEYSTORE_PASSWORD" - The
              encrypted version of the Kafka client keystore password
              (if the user has the Glue encrypt passwords setting
              selected).

            * "ENCRYPTED_KAFKA_CLIENT_KEY_PASSWORD" - The encrypted
              version of the Kafka client key password (if the user
              has the Glue encrypt passwords setting selected).

            * "KAFKA_SASL_MECHANISM" - The SASL mechanism to use. The
              supported mechanisms are "SCRAM-SHA-512", "GSSAPI",
              "AWS_MSK_IAM", and "PLAIN".

            * "KAFKA_SASL_PLAIN_USERNAME" - A plaintext username used
              to authenticate with the "PLAIN" mechanism.

            * "KAFKA_SASL_PLAIN_PASSWORD" - A plaintext password used
              to authenticate with the "PLAIN" mechanism.

            * "ENCRYPTED_KAFKA_SASL_PLAIN_PASSWORD" - The encrypted
              version of the Kafka SASL PLAIN password (if the user
              has the Glue encrypt passwords setting selected).

            * "KAFKA_SASL_SCRAM_USERNAME" - A plaintext username used
              to authenticate with the "SCRAM-SHA-512" mechanism.

            * "KAFKA_SASL_SCRAM_PASSWORD" - A plaintext password used
              to authenticate with the "SCRAM-SHA-512" mechanism.

            * "ENCRYPTED_KAFKA_SASL_SCRAM_PASSWORD" - The encrypted
              version of the Kafka SASL SCRAM password (if the user
              has the Glue encrypt passwords setting selected).

            * "KAFKA_SASL_SCRAM_SECRETS_ARN" - The Amazon Resource
              Name of a secret in Amazon Web Services Secrets Manager.

            * "KAFKA_SASL_GSSAPI_KEYTAB" - The S3 location of a
              Kerberos "keytab" file. A keytab stores long-term keys
              for one or more principals. For more information, see
              MIT Kerberos Documentation: Keytab.

            * "KAFKA_SASL_GSSAPI_KRB5_CONF" - The S3 location of a
              Kerberos "krb5.conf" file. A krb5.conf stores Kerberos
              configuration information, such as the location of the
              KDC server. For more information, see MIT Kerberos
              Documentation: krb5.conf.

            * "KAFKA_SASL_GSSAPI_SERVICE" - The Kerberos service name,
              as set with "sasl.kerberos.service.name" in your Kafka
              Configuration.

            * "KAFKA_SASL_GSSAPI_PRINCIPAL" - The name of the Kerberos
              principal used by Glue. For more information, see Kafka
              Documentation: Configuring Kafka Brokers.

            * "ROLE_ARN" - The role to be used for running queries.

            * "REGION" - The Amazon Web Services Region where queries
              will be run.

            * "WORKGROUP_NAME" - The name of an Amazon Redshift
              serverless workgroup or Amazon Athena workgroup in which
              queries will run.

            * "CLUSTER_IDENTIFIER" - The cluster identifier of an
              Amazon Redshift cluster in which queries will run.

            * "DATABASE" - The Amazon Redshift database that you are
              connecting to.

            * *(string) --*

              * *(string) --*

          * **SparkProperties** *(dict) --*

            Connection properties specific to the Spark compute
            environment.

            * *(string) --*

              * *(string) --*

          * **AthenaProperties** *(dict) --*

            Connection properties specific to the Athena compute
            environment.

            * *(string) --*

              * *(string) --*

          * **PythonProperties** *(dict) --*

            Connection properties specific to the Python compute
            environment.

            * *(string) --*

              * *(string) --*

          * **PhysicalConnectionRequirements** *(dict) --*

            The physical connection requirements, such as virtual
            private cloud (VPC) and "SecurityGroup", that are needed
            to make this connection successfully.

            * **SubnetId** *(string) --*

              The subnet ID used by the connection.

            * **SecurityGroupIdList** *(list) --*

              The security group ID list used by the connection.

              * *(string) --*

            * **AvailabilityZone** *(string) --*

              The connection's Availability Zone.

          * **CreationTime** *(datetime) --*

            The timestamp of the time that this connection definition
            was created.

          * **LastUpdatedTime** *(datetime) --*

            The timestamp of the last time the connection definition
            was updated.

          * **LastUpdatedBy** *(string) --*

            The user, group, or role that last updated this connection
            definition.

          * **Status** *(string) --*

            The status of the connection. Can be one of: "READY",
            "IN_PROGRESS", or "FAILED".

          * **StatusReason** *(string) --*

            The reason for the connection status.

          * **LastConnectionValidationTime** *(datetime) --*

            A timestamp of the time this connection was last
            validated.

          * **AuthenticationConfiguration** *(dict) --*

            The authentication properties of the connection.

            * **AuthenticationType** *(string) --*

              The type of authentication used by the connection.

            * **SecretArn** *(string) --*

              The ARN of the Secrets Manager secret that stores the
              credentials.

            * **OAuth2Properties** *(dict) --*

              The properties for OAuth2 authentication.

              * **OAuth2GrantType** *(string) --*

                The OAuth2 grant type. For example,
                "AUTHORIZATION_CODE", "JWT_BEARER", or
                "CLIENT_CREDENTIALS".

              * **OAuth2ClientApplication** *(dict) --*

                The client application type. For example, AWS_MANAGED
                or USER_MANAGED.

                * **UserManagedClientApplicationClientId** *(string)
                  --*

                  The client application clientID if the ClientAppType
                  is "USER_MANAGED".

                * **AWSManagedClientApplicationReference** *(string)
                  --*

                  The reference to the SaaS-side client app that is
                  Amazon Web Services managed.

              * **TokenUrl** *(string) --*

                The URL of the provider's authentication server, to
                exchange an authorization code for an access token.

              * **TokenUrlParametersMap** *(dict) --*

                A map of parameters that are added to the token "GET"
                request.

                * *(string) --*

                  * *(string) --*

          * **ConnectionSchemaVersion** *(integer) --*

            The version of the connection schema for this connection.
            Version 2 supports properties for specific compute
            environments.

          * **CompatibleComputeEnvironments** *(list) --*

            A list of compute environments compatible with the
            connection.

            * *(string) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.GlueEncryptionException"


get_security_configurations
***************************

Glue.Client.get_security_configurations(**kwargs)

   Retrieves a list of all security configurations.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_security_configurations(
          MaxResults=123,
          NextToken='string'
      )

   Parameters:
      * **MaxResults** (*integer*) -- The maximum number of results to
        return.

      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation call.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'SecurityConfigurations': [
                 {
                     'Name': 'string',
                     'CreatedTimeStamp': datetime(2015, 1, 1),
                     'EncryptionConfiguration': {
                         'S3Encryption': [
                             {
                                 'S3EncryptionMode': 'DISABLED'|'SSE-KMS'|'SSE-S3',
                                 'KmsKeyArn': 'string'
                             },
                         ],
                         'CloudWatchEncryption': {
                             'CloudWatchEncryptionMode': 'DISABLED'|'SSE-KMS',
                             'KmsKeyArn': 'string'
                         },
                         'JobBookmarksEncryption': {
                             'JobBookmarksEncryptionMode': 'DISABLED'|'CSE-KMS',
                             'KmsKeyArn': 'string'
                         },
                         'DataQualityEncryption': {
                             'DataQualityEncryptionMode': 'DISABLED'|'SSE-KMS',
                             'KmsKeyArn': 'string'
                         }
                     }
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **SecurityConfigurations** *(list) --*

          A list of security configurations.

          * *(dict) --*

            Specifies a security configuration.

            * **Name** *(string) --*

              The name of the security configuration.

            * **CreatedTimeStamp** *(datetime) --*

              The time at which this security configuration was
              created.

            * **EncryptionConfiguration** *(dict) --*

              The encryption configuration associated with this
              security configuration.

              * **S3Encryption** *(list) --*

                The encryption configuration for Amazon Simple Storage
                Service (Amazon S3) data.

                * *(dict) --*

                  Specifies how Amazon Simple Storage Service (Amazon
                  S3) data should be encrypted.

                  * **S3EncryptionMode** *(string) --*

                    The encryption mode to use for Amazon S3 data.

                  * **KmsKeyArn** *(string) --*

                    The Amazon Resource Name (ARN) of the KMS key to
                    be used to encrypt the data.

              * **CloudWatchEncryption** *(dict) --*

                The encryption configuration for Amazon CloudWatch.

                * **CloudWatchEncryptionMode** *(string) --*

                  The encryption mode to use for CloudWatch data.

                * **KmsKeyArn** *(string) --*

                  The Amazon Resource Name (ARN) of the KMS key to be
                  used to encrypt the data.

              * **JobBookmarksEncryption** *(dict) --*

                The encryption configuration for job bookmarks.

                * **JobBookmarksEncryptionMode** *(string) --*

                  The encryption mode to use for job bookmarks data.

                * **KmsKeyArn** *(string) --*

                  The Amazon Resource Name (ARN) of the KMS key to be
                  used to encrypt the data.

              * **DataQualityEncryption** *(dict) --*

                The encryption configuration for Glue Data Quality
                assets.

                * **DataQualityEncryptionMode** *(string) --*

                  The encryption mode to use for encrypting Data
                  Quality assets. These assets include data quality
                  rulesets, results, statistics, anomaly detection
                  models and observations.

                  Valid values are "SSE-KMS" for encryption using a
                  customer-managed KMS key, or "DISABLED".

                * **KmsKeyArn** *(string) --*

                  The Amazon Resource Name (ARN) of the KMS key to be
                  used to encrypt the data.

        * **NextToken** *(string) --*

          A continuation token, if there are more security
          configurations to return.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
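
   The "MaxResults" and "NextToken" parameters above imply a
   page-by-page retrieval loop. The following is a minimal sketch of
   that loop, not an official helper: the function names and the
   "page_size" default are hypothetical, and "client" is assumed to be
   a Glue client such as the one created with "boto3.client('glue')".

```python
def iter_security_configurations(client, page_size=50):
    """Yield each security configuration, following NextToken across pages."""
    kwargs = {"MaxResults": page_size}  # page_size is an illustrative default
    while True:
        response = client.get_security_configurations(**kwargs)
        yield from response.get("SecurityConfigurations", [])
        token = response.get("NextToken")
        if not token:
            break  # no continuation token, so this was the last page
        kwargs["NextToken"] = token


def s3_encryption_modes(configuration):
    """Return the S3EncryptionMode values recorded in one configuration."""
    encryption = configuration.get("EncryptionConfiguration", {})
    return [
        entry.get("S3EncryptionMode")
        for entry in encryption.get("S3Encryption", [])
    ]
```

   With a real client, "list(iter_security_configurations(client))"
   would collect every configuration, and "s3_encryption_modes" could
   then be applied to each entry to audit the Amazon S3 settings.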


update_catalog
**************

Glue.Client.update_catalog(**kwargs)

   Updates an existing catalog's properties in the Glue Data Catalog.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.update_catalog(
          CatalogId='string',
          CatalogInput={
              'Description': 'string',
              'FederatedCatalog': {
                  'Identifier': 'string',
                  'ConnectionName': 'string',
                  'ConnectionType': 'string'
              },
              'Parameters': {
                  'string': 'string'
              },
              'TargetRedshiftCatalog': {
                  'CatalogArn': 'string'
              },
              'CatalogProperties': {
                  'DataLakeAccessProperties': {
                      'DataLakeAccess': True|False,
                      'DataTransferRole': 'string',
                      'KmsKey': 'string',
                      'CatalogType': 'string'
                  },
                  'IcebergOptimizationProperties': {
                      'RoleArn': 'string',
                      'Compaction': {
                          'string': 'string'
                      },
                      'Retention': {
                          'string': 'string'
                      },
                      'OrphanFileDeletion': {
                          'string': 'string'
                      }
                  },
                  'CustomProperties': {
                      'string': 'string'
                  }
              },
              'CreateTableDefaultPermissions': [
                  {
                      'Principal': {
                          'DataLakePrincipalIdentifier': 'string'
                      },
                      'Permissions': [
                          'ALL'|'SELECT'|'ALTER'|'DROP'|'DELETE'|'INSERT'|'CREATE_DATABASE'|'CREATE_TABLE'|'DATA_LOCATION_ACCESS',
                      ]
                  },
              ],
              'CreateDatabaseDefaultPermissions': [
                  {
                      'Principal': {
                          'DataLakePrincipalIdentifier': 'string'
                      },
                      'Permissions': [
                          'ALL'|'SELECT'|'ALTER'|'DROP'|'DELETE'|'INSERT'|'CREATE_DATABASE'|'CREATE_TABLE'|'DATA_LOCATION_ACCESS',
                      ]
                  },
              ],
              'AllowFullTableExternalDataAccess': 'True'|'False'
          }
      )

   Parameters:
      * **CatalogId** (*string*) --

        **[REQUIRED]**

        The ID of the catalog.

      * **CatalogInput** (*dict*) --

        **[REQUIRED]**

        A "CatalogInput" object specifying the new properties of an
        existing catalog.

        * **Description** *(string) --*

          A description of the catalog. The string can be no more than
          2048 bytes long and must match the URI address multi-line
          string pattern.

        * **FederatedCatalog** *(dict) --*

          A "FederatedCatalog" object that references an entity
          outside the Glue Data Catalog, for example a Redshift
          database.

          * **Identifier** *(string) --*

            A unique identifier for the federated catalog.

          * **ConnectionName** *(string) --*

            The name of the connection to an external data source, for
            example a Redshift-federated catalog.

          * **ConnectionType** *(string) --*

            The type of connection used to access the federated
            catalog, specifying the protocol or method for connection
            to the external data source.

        * **Parameters** *(dict) --*

          A map of key-value pairs that define the parameters and
          properties of the catalog.

          * *(string) --*

            * *(string) --*

        * **TargetRedshiftCatalog** *(dict) --*

          A "TargetRedshiftCatalog" object that describes a target
          catalog for resource linking.

          * **CatalogArn** *(string) --* **[REQUIRED]**

            The Amazon Resource Name (ARN) of the catalog resource.

        * **CatalogProperties** *(dict) --*

          A "CatalogProperties" object that specifies data lake access
          properties and other custom properties.

          * **DataLakeAccessProperties** *(dict) --*

            A "DataLakeAccessProperties" object that specifies
            properties to configure data lake access for your catalog
            resource in the Glue Data Catalog.

            * **DataLakeAccess** *(boolean) --*

              Turns on or off data lake access for Apache Spark
              applications that access Amazon Redshift databases in
              the Data Catalog from any non-Redshift engine, such as
              Amazon Athena, Amazon EMR, or Glue ETL.

            * **DataTransferRole** *(string) --*

              A role that will be assumed by Glue for transferring
              data into/out of the staging bucket during a query.

            * **KmsKey** *(string) --*

              An encryption key that will be used for the staging
              bucket that will be created along with the catalog.

            * **CatalogType** *(string) --*

              Specifies a federated catalog type for the native
              catalog resource. The currently supported type is
              "aws:redshift".

          * **IcebergOptimizationProperties** *(dict) --*

            A structure that specifies Iceberg table optimization
            properties for the catalog. This includes configuration
            for compaction, retention, and orphan file deletion
            operations that can be applied to Iceberg tables in this
            catalog.

            * **RoleArn** *(string) --*

              The Amazon Resource Name (ARN) of the IAM role that will
              be assumed to perform Iceberg table optimization
              operations.

            * **Compaction** *(dict) --*

              A map of key-value pairs that specify configuration
              parameters for Iceberg table compaction operations,
              which optimize the layout of data files to improve query
              performance.

              * *(string) --*

                * *(string) --*

            * **Retention** *(dict) --*

              A map of key-value pairs that specify configuration
              parameters for Iceberg table retention operations, which
              manage the lifecycle of table snapshots to control
              storage costs.

              * *(string) --*

                * *(string) --*

            * **OrphanFileDeletion** *(dict) --*

              A map of key-value pairs that specify configuration
              parameters for Iceberg orphan file deletion operations,
              which identify and remove files that are no longer
              referenced by the table metadata.

              * *(string) --*

                * *(string) --*

          * **CustomProperties** *(dict) --*

            Additional key-value properties for the catalog, such as
            column statistics optimizations.

            * *(string) --*

              * *(string) --*

        * **CreateTableDefaultPermissions** *(list) --*

          An array of "PrincipalPermissions" objects. Creates a set of
          default permissions on the table(s) for principals. Used by
          Amazon Web Services Lake Formation. Typically should be
          explicitly set as an empty list.

          * *(dict) --*

            Permissions granted to a principal.

            * **Principal** *(dict) --*

              The principal who is granted permissions.

              * **DataLakePrincipalIdentifier** *(string) --*

                An identifier for the Lake Formation principal.

            * **Permissions** *(list) --*

              The permissions that are granted to the principal.

              * *(string) --*

        * **CreateDatabaseDefaultPermissions** *(list) --*

          An array of "PrincipalPermissions" objects. Creates a set of
          default permissions on the database(s) for principals. Used
          by Amazon Web Services Lake Formation. Typically should be
          explicitly set as an empty list.

          * *(dict) --*

            Permissions granted to a principal.

            * **Principal** *(dict) --*

              The principal who is granted permissions.

              * **DataLakePrincipalIdentifier** *(string) --*

                An identifier for the Lake Formation principal.

            * **Permissions** *(list) --*

              The permissions that are granted to the principal.

              * *(string) --*

        * **AllowFullTableExternalDataAccess** *(string) --*

          Allows third-party engines to access data in Amazon S3
          locations that are registered with Lake Formation.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"

   * "Glue.Client.exceptions.ConcurrentModificationException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.FederationSourceException"
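
   As an illustration of the request shape, the sketch below builds a
   small "CatalogInput" and passes it to "update_catalog". The helper
   names are hypothetical, and "client" is assumed to be a Glue client
   such as "boto3.client('glue')". Both default-permission lists are
   set to empty lists, following the note above that they typically
   should be. This section does not say whether fields omitted from
   "CatalogInput" keep their previous values, so treat this as a
   starting point rather than a safe partial update.

```python
def build_catalog_input(description, parameters=None):
    """Build a minimal CatalogInput dict (hypothetical helper)."""
    catalog_input = {
        "Description": description,
        # Explicit empty lists, as the parameter descriptions suggest.
        "CreateTableDefaultPermissions": [],
        "CreateDatabaseDefaultPermissions": [],
    }
    if parameters:
        # Copy so later changes by the caller do not alter the request.
        catalog_input["Parameters"] = dict(parameters)
    return catalog_input


def update_catalog_description(client, catalog_id, description, parameters=None):
    """Call update_catalog with the input built above; returns the response dict."""
    return client.update_catalog(
        CatalogId=catalog_id,
        CatalogInput=build_catalog_input(description, parameters),
    )
```

   The response syntax above is an empty dict, so a successful call
   carries no payload; failures surface as the exceptions listed for
   this operation.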


start_data_quality_ruleset_evaluation_run
*****************************************

Glue.Client.start_data_quality_ruleset_evaluation_run(**kwargs)

   Once you have a ruleset definition (either recommended or your
   own), you call this operation to evaluate the ruleset against a
   data source (Glue table). The evaluation computes results which you
   can retrieve with the "GetDataQualityResult" API.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.start_data_quality_ruleset_evaluation_run(
          DataSource={
              'GlueTable': {
                  'DatabaseName': 'string',
                  'TableName': 'string',
                  'CatalogId': 'string',
                  'ConnectionName': 'string',
                  'AdditionalOptions': {
                      'string': 'string'
                  }
              }
          },
          Role='string',
          NumberOfWorkers=123,
          Timeout=123,
          ClientToken='string',
          AdditionalRunOptions={
              'CloudWatchMetricsEnabled': True|False,
              'ResultsS3Prefix': 'string',
              'CompositeRuleEvaluationMethod': 'COLUMN'|'ROW'
          },
          RulesetNames=[
              'string',
          ],
          AdditionalDataSources={
              'string': {
                  'GlueTable': {
                      'DatabaseName': 'string',
                      'TableName': 'string',
                      'CatalogId': 'string',
                      'ConnectionName': 'string',
                      'AdditionalOptions': {
                          'string': 'string'
                      }
                  }
              }
          }
      )

   Parameters:
      * **DataSource** (*dict*) --

        **[REQUIRED]**

        The data source (Glue table) associated with this run.

        * **GlueTable** *(dict) --* **[REQUIRED]**

          A Glue table.

          * **DatabaseName** *(string) --* **[REQUIRED]**

            A database name in the Glue Data Catalog.

          * **TableName** *(string) --* **[REQUIRED]**

            A table name in the Glue Data Catalog.

          * **CatalogId** *(string) --*

            A unique identifier for the Glue Data Catalog.

          * **ConnectionName** *(string) --*

            The name of the connection to the Glue Data Catalog.

          * **AdditionalOptions** *(dict) --*

            Additional options for the table. Currently there are two
            keys supported:

            * "pushDownPredicate": to filter on partitions without
              having to list and read all the files in your dataset.

            * "catalogPartitionPredicate": to use server-side
              partition pruning using partition indexes in the Glue
              Data Catalog.

            * *(string) --*

              * *(string) --*

      * **Role** (*string*) --

        **[REQUIRED]**

        An IAM role supplied to encrypt the results of the run.

      * **NumberOfWorkers** (*integer*) -- The number of "G.1X"
        workers to be used in the run. The default is 5.

      * **Timeout** (*integer*) -- The timeout for a run in minutes.
        This is the maximum time that a run can consume resources
        before it is terminated and enters "TIMEOUT" status. The
        default is 2,880 minutes (48 hours).

      * **ClientToken** (*string*) -- Used for idempotency and is
        recommended to be set to a random ID (such as a UUID) to avoid
        creating or starting multiple instances of the same resource.

      * **AdditionalRunOptions** (*dict*) --

        Additional run options you can specify for an evaluation run.

        * **CloudWatchMetricsEnabled** *(boolean) --*

          Whether or not to enable CloudWatch metrics.

        * **ResultsS3Prefix** *(string) --*

          Prefix for Amazon S3 to store results.

        * **CompositeRuleEvaluationMethod** *(string) --*

          Sets the evaluation method for composite rules in the
          ruleset to "ROW" or "COLUMN".

      * **RulesetNames** (*list*) --

        **[REQUIRED]**

        A list of ruleset names.

        * *(string) --*

      * **AdditionalDataSources** (*dict*) --

        A map of reference strings to additional data sources you can
        specify for an evaluation run.

        * *(string) --*

          * *(dict) --*

            A data source (a Glue table) for which you want data
            quality results.

            * **GlueTable** *(dict) --* **[REQUIRED]**

              A Glue table.

              * **DatabaseName** *(string) --* **[REQUIRED]**

                A database name in the Glue Data Catalog.

              * **TableName** *(string) --* **[REQUIRED]**

                A table name in the Glue Data Catalog.

              * **CatalogId** *(string) --*

                A unique identifier for the Glue Data Catalog.

              * **ConnectionName** *(string) --*

                The name of the connection to the Glue Data Catalog.

              * **AdditionalOptions** *(dict) --*

                Additional options for the table. Currently there are
                two keys supported:

                * "pushDownPredicate": to filter on partitions without
                  having to list and read all the files in your
                  dataset.

                * "catalogPartitionPredicate": to use server-side
                  partition pruning using partition indexes in the
                  Glue Data Catalog.

                * *(string) --*

                  * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'RunId': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **RunId** *(string) --*

          The unique run identifier associated with this run.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.ConflictException"
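
   The following sketch shows one way to assemble the required
   parameters and a "ClientToken" for this call. The function name and
   its arguments are hypothetical, and "client" is assumed to be a
   Glue client such as "boto3.client('glue')". The optional
   "partition_predicate" is passed through the "pushDownPredicate" key
   described above.

```python
import uuid


def start_evaluation_run(client, database, table, role, ruleset_names,
                         partition_predicate=None, client_token=None):
    """Start a ruleset evaluation run and return its RunId (hypothetical helper)."""
    glue_table = {"DatabaseName": database, "TableName": table}
    if partition_predicate:
        # Filter partitions up front via the documented pushDownPredicate key.
        glue_table["AdditionalOptions"] = {"pushDownPredicate": partition_predicate}
    response = client.start_data_quality_ruleset_evaluation_run(
        DataSource={"GlueTable": glue_table},
        Role=role,
        RulesetNames=list(ruleset_names),
        # A random UUID is used when the caller does not supply a token.
        ClientToken=client_token or str(uuid.uuid4()),
    )
    return response["RunId"]
```

   Reusing the same "client_token" when retrying a request is the
   usual way to benefit from the idempotency that the "ClientToken"
   parameter describes; generating a fresh UUID on every retry would
   defeat it.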


get_column_statistics_task_run
******************************

Glue.Client.get_column_statistics_task_run(**kwargs)

   Get the associated metadata/information for a task run, given a
   task run ID.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_column_statistics_task_run(
          ColumnStatisticsTaskRunId='string'
      )

   Parameters:
      **ColumnStatisticsTaskRunId** (*string*) --

      **[REQUIRED]**

      The identifier for the particular column statistics task run.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'ColumnStatisticsTaskRun': {
                 'CustomerId': 'string',
                 'ColumnStatisticsTaskRunId': 'string',
                 'DatabaseName': 'string',
                 'TableName': 'string',
                 'ColumnNameList': [
                     'string',
                 ],
                 'CatalogID': 'string',
                 'Role': 'string',
                 'SampleSize': 123.0,
                 'SecurityConfiguration': 'string',
                 'NumberOfWorkers': 123,
                 'WorkerType': 'string',
                 'ComputationType': 'FULL'|'INCREMENTAL',
                 'Status': 'STARTING'|'RUNNING'|'SUCCEEDED'|'FAILED'|'STOPPED',
                 'CreationTime': datetime(2015, 1, 1),
                 'LastUpdated': datetime(2015, 1, 1),
                 'StartTime': datetime(2015, 1, 1),
                 'EndTime': datetime(2015, 1, 1),
                 'ErrorMessage': 'string',
                 'DPUSeconds': 123.0
             }
         }

      **Response Structure**

      * *(dict) --*

        * **ColumnStatisticsTaskRun** *(dict) --*

          A "ColumnStatisticsTaskRun" object representing the details
          of the column stats run.

          * **CustomerId** *(string) --*

            The Amazon Web Services account ID.

          * **ColumnStatisticsTaskRunId** *(string) --*

            The identifier for the particular column statistics task
            run.

          * **DatabaseName** *(string) --*

            The database where the table resides.

          * **TableName** *(string) --*

            The name of the table for which column statistics are
            generated.

          * **ColumnNameList** *(list) --*

            A list of the column names. If none is supplied, all
            column names for the table will be used by default.

            * *(string) --*

          * **CatalogID** *(string) --*

            The ID of the Data Catalog where the table resides. If
            none is supplied, the Amazon Web Services account ID is
            used by default.

          * **Role** *(string) --*

            The IAM role that the service assumes to generate
            statistics.

          * **SampleSize** *(float) --*

            The percentage of rows used to generate statistics. If
            none is supplied, the entire table will be used to
            generate stats.

          * **SecurityConfiguration** *(string) --*

            Name of the security configuration that is used to encrypt
            CloudWatch logs for the column stats task run.

          * **NumberOfWorkers** *(integer) --*

            The number of workers used to generate column statistics.
            The job is preconfigured to autoscale up to 25 instances.

          * **WorkerType** *(string) --*

            The type of workers being used for generating stats. The
            default is "g.1x".

          * **ComputationType** *(string) --*

            The type of column statistics computation.

          * **Status** *(string) --*

            The status of the task run.

          * **CreationTime** *(datetime) --*

            The time that this task was created.

          * **LastUpdated** *(datetime) --*

            The last point in time when this task was modified.

          * **StartTime** *(datetime) --*

            The start time of the task.

          * **EndTime** *(datetime) --*

            The end time of the task.

          * **ErrorMessage** *(string) --*

            The error message for the job.

          * **DPUSeconds** *(float) --*

            The calculated DPU usage in seconds for all autoscaled
            workers.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"
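
   Because the response carries a "Status" field, a caller can poll
   this operation until the run finishes. The sketch below is one
   possible polling loop, not part of the API. It assumes that
   "SUCCEEDED", "FAILED", and "STOPPED" are the terminal values of the
   "Status" enum shown above; the function name, the intervals, and
   the injectable "sleep" argument are hypothetical.

```python
import time

# Assumed terminal states, inferred from the Status enum names above.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "STOPPED"}


def wait_for_column_statistics_run(client, run_id, poll_seconds=30,
                                   max_polls=120, sleep=time.sleep):
    """Poll until the task run reaches a terminal status, then return it."""
    for attempt in range(max_polls):
        response = client.get_column_statistics_task_run(
            ColumnStatisticsTaskRunId=run_id
        )
        run = response["ColumnStatisticsTaskRun"]
        if run["Status"] in TERMINAL_STATUSES:
            return run
        if attempt < max_polls - 1:
            sleep(poll_seconds)  # wait before asking again
    raise TimeoutError(f"task run {run_id} did not finish after {max_polls} polls")
```

   On a "FAILED" run, the returned dict's "ErrorMessage" field is the
   natural place to look for the cause.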


delete_schema_versions
**********************

Glue.Client.delete_schema_versions(**kwargs)

   Removes versions from the specified schema. A version number or
   range may be supplied. If the compatibility mode, such as
   BACKWARDS_FULL, forbids deleting a version that is necessary, an
   error is returned. Calling the "GetSchemaVersions" API after this
   call will list the status of the deleted versions.

   When the range of version numbers contains a checkpointed version,
   the API returns a 409 conflict and does not proceed with the
   deletion. You must first remove the checkpoint using the
   "DeleteSchemaCheckpoint" API before calling this API.

   You cannot use the "DeleteSchemaVersions" API to delete the first
   schema version in the schema set. The first schema version can only
   be deleted by the "DeleteSchema" API. This operation will also
   delete the attached "SchemaVersionMetadata" under the schema
   versions. Hard deletes will be enforced on the database.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_schema_versions(
          SchemaId={
              'SchemaArn': 'string',
              'SchemaName': 'string',
              'RegistryName': 'string'
          },
          Versions='string'
      )

   Parameters:
      * **SchemaId** (*dict*) --

        **[REQUIRED]**

        This is a wrapper structure that may contain the schema name
        and Amazon Resource Name (ARN).

        * **SchemaArn** *(string) --*

          The Amazon Resource Name (ARN) of the schema. One of
          "SchemaArn" or "SchemaName" has to be provided.

        * **SchemaName** *(string) --*

          The name of the schema. One of "SchemaArn" or "SchemaName"
          has to be provided.

        * **RegistryName** *(string) --*

          The name of the schema registry that contains the schema.

      * **Versions** (*string*) --

        **[REQUIRED]**

        A version range may be supplied which may be of the format:

        * a single version number, 5

        * a range, 5-8 : deletes versions 5, 6, 7, 8

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'SchemaVersionErrors': [
                 {
                     'VersionNumber': 123,
                     'ErrorDetails': {
                         'ErrorCode': 'string',
                         'ErrorMessage': 'string'
                     }
                 },
             ]
         }

      **Response Structure**

      * *(dict) --*

        * **SchemaVersionErrors** *(list) --*

          A list of "SchemaVersionErrorItem" objects, each containing
          an error and schema version.

          * *(dict) --*

            An object that contains the error details for an operation
            on a schema version.

            * **VersionNumber** *(integer) --*

              The version number of the schema.

            * **ErrorDetails** *(dict) --*

              The details of the error for the schema version.

              * **ErrorCode** *(string) --*

                The error code for an error.

              * **ErrorMessage** *(string) --*

                The error message for an error.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.ConcurrentModificationException"
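
   The "Versions" string format and the per-version error list can be
   handled with two small helpers. This is a sketch under stated
   assumptions: both function names are hypothetical, the schema is
   identified by "SchemaName" plus "RegistryName" rather than by ARN,
   and "client" is assumed to be a Glue client such as
   "boto3.client('glue')".

```python
def format_versions(first, last=None):
    """Format a single version ("5") or an inclusive range ("5-8")."""
    if last is None or last == first:
        return str(first)
    if last < first:
        raise ValueError("last version must not be lower than first")
    return f"{first}-{last}"


def delete_schema_version_range(client, schema_name, registry_name, first, last=None):
    """Delete the versions and return (version, code, message) per reported error."""
    response = client.delete_schema_versions(
        SchemaId={"SchemaName": schema_name, "RegistryName": registry_name},
        Versions=format_versions(first, last),
    )
    errors = []
    for item in response.get("SchemaVersionErrors", []):
        details = item.get("ErrorDetails", {})
        errors.append(
            (item.get("VersionNumber"), details.get("ErrorCode"), details.get("ErrorMessage"))
        )
    return errors
```

   An empty list from "delete_schema_version_range" means the response
   reported no per-version errors; request-level failures are still
   raised as the exceptions listed above.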


update_integration_resource_property
************************************

Glue.Client.update_integration_resource_property(**kwargs)

   This API can be used for updating the "ResourceProperty" of the
   Glue connection (for the source) or Glue database ARN (for the
   target). These properties can include the role to access the
   connection or database. Since the same resource can be used across
   multiple integrations, updating resource properties will impact all
   the integrations using it.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.update_integration_resource_property(
          ResourceArn='string',
          SourceProcessingProperties={
              'RoleArn': 'string'
          },
          TargetProcessingProperties={
              'RoleArn': 'string',
              'KmsArn': 'string',
              'ConnectionName': 'string',
              'EventBusArn': 'string'
          }
      )

   Parameters:
      * **ResourceArn** (*string*) --

        **[REQUIRED]**

        The connection ARN of the source, or the database ARN of the
        target.

      * **SourceProcessingProperties** (*dict*) --

        The resource properties associated with the integration
        source.

        * **RoleArn** *(string) --*

          The IAM role to access the Glue connection.

      * **TargetProcessingProperties** (*dict*) --

        The resource properties associated with the integration
        target.

        * **RoleArn** *(string) --*

          The IAM role to access the Glue database.

        * **KmsArn** *(string) --*

          The ARN of the KMS key used for encryption.

        * **ConnectionName** *(string) --*

          The Glue network connection to configure the Glue job
          running in the customer VPC.

        * **EventBusArn** *(string) --*

          The ARN of an EventBridge event bus to receive the
          integration status notification.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'ResourceArn': 'string',
             'SourceProcessingProperties': {
                 'RoleArn': 'string'
             },
             'TargetProcessingProperties': {
                 'RoleArn': 'string',
                 'KmsArn': 'string',
                 'ConnectionName': 'string',
                 'EventBusArn': 'string'
             }
         }

      **Response Structure**

      * *(dict) --*

        * **ResourceArn** *(string) --*

          The connection ARN of the source, or the database ARN of the
          target.

        * **SourceProcessingProperties** *(dict) --*

          The resource properties associated with the integration
          source.

          * **RoleArn** *(string) --*

            The IAM role to access the Glue connection.

        * **TargetProcessingProperties** *(dict) --*

          The resource properties associated with the integration
          target.

          * **RoleArn** *(string) --*

            The IAM role to access the Glue database.

          * **KmsArn** *(string) --*

            The ARN of the KMS key used for encryption.

          * **ConnectionName** *(string) --*

            The Glue network connection to configure the Glue job
            running in the customer VPC.

          * **EventBusArn** *(string) --*

            The ARN of an EventBridge event bus to receive the
            integration status notification.

   **Exceptions**

   * "Glue.Client.exceptions.ValidationException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.InternalServerException"

   * "Glue.Client.exceptions.ResourceNotFoundException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.InvalidInputException"
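
   The request shape above can be assembled with a small helper. This
   is a minimal sketch, not part of the service API: the helper name
   "build_target_property_request" and every ARN below are
   hypothetical placeholders, and leaving out unset keys is a choice
   made by this sketch rather than a documented requirement.

```python
def build_target_property_request(resource_arn, role_arn=None, kms_arn=None,
                                  connection_name=None, event_bus_arn=None):
    """Build kwargs for update_integration_resource_property (target side)."""
    candidates = {
        "RoleArn": role_arn,
        "KmsArn": kms_arn,
        "ConnectionName": connection_name,
        "EventBusArn": event_bus_arn,
    }
    # Keep only the properties the caller actually supplied.
    target = {key: value for key, value in candidates.items() if value is not None}
    return {"ResourceArn": resource_arn, "TargetProcessingProperties": target}


# Placeholder ARNs for illustration only.
request = build_target_property_request(
    resource_arn="arn:aws:glue:us-east-1:111122223333:database/example_db",
    role_arn="arn:aws:iam::111122223333:role/ExampleGlueTargetRole",
    kms_arn="arn:aws:kms:us-east-1:111122223333:key/example-key-id",
)
# With a real client: response = client.update_integration_resource_property(**request)
```

   The returned dictionary mirrors the request syntax, so it can be
   unpacked directly into the client call.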


create_trigger
**************

Glue.Client.create_trigger(**kwargs)

   Creates a new trigger.

   Job arguments may be logged. Do not pass plaintext secrets as
   arguments. Retrieve secrets from a Glue Connection, Amazon Web
   Services Secrets Manager or other secret management mechanism if
   you intend to keep them within the Job.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_trigger(
          Name='string',
          WorkflowName='string',
          Type='SCHEDULED'|'CONDITIONAL'|'ON_DEMAND'|'EVENT',
          Schedule='string',
          Predicate={
              'Logical': 'AND'|'ANY',
              'Conditions': [
                  {
                      'LogicalOperator': 'EQUALS',
                      'JobName': 'string',
                      'State': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT'|'ERROR'|'WAITING'|'EXPIRED',
                      'CrawlerName': 'string',
                      'CrawlState': 'RUNNING'|'CANCELLING'|'CANCELLED'|'SUCCEEDED'|'FAILED'|'ERROR'
                  },
              ]
          },
          Actions=[
              {
                  'JobName': 'string',
                  'Arguments': {
                      'string': 'string'
                  },
                  'Timeout': 123,
                  'SecurityConfiguration': 'string',
                  'NotificationProperty': {
                      'NotifyDelayAfter': 123
                  },
                  'CrawlerName': 'string'
              },
          ],
          Description='string',
          StartOnCreation=True|False,
          Tags={
              'string': 'string'
          },
          EventBatchingCondition={
              'BatchSize': 123,
              'BatchWindow': 123
          }
      )

   Parameters:
      * **Name** (*string*) --

        **[REQUIRED]**

        The name of the trigger.

      * **WorkflowName** (*string*) -- The name of the workflow
        associated with the trigger.

      * **Type** (*string*) --

        **[REQUIRED]**

        The type of the new trigger.

      * **Schedule** (*string*) --

        A "cron" expression used to specify the schedule (see
        Time-Based Schedules for Jobs and Crawlers). For example, to
        run something every day at 12:15 UTC, you would specify:
        "cron(15 12 * * ? *)".

        This field is required when the trigger type is "SCHEDULED".

      * **Predicate** (*dict*) --

        A predicate to specify when the new trigger should fire.

        This field is required when the trigger type is "CONDITIONAL".

        * **Logical** *(string) --*

          An optional field if only one condition is listed. If
          multiple conditions are listed, then this field is required.

        * **Conditions** *(list) --*

          A list of the conditions that determine when the trigger
          will fire.

          * *(dict) --*

            Defines a condition under which a trigger fires.

            * **LogicalOperator** *(string) --*

              A logical operator.

            * **JobName** *(string) --*

              The name of the job whose "JobRuns" this condition
              applies to, and on which this trigger waits.

            * **State** *(string) --*

              The condition state. Currently, the only job states that
              a trigger can listen for are "SUCCEEDED", "STOPPED",
              "FAILED", and "TIMEOUT". The only crawler states that a
              trigger can listen for are "SUCCEEDED", "FAILED", and
              "CANCELLED".

            * **CrawlerName** *(string) --*

              The name of the crawler to which this condition applies.

            * **CrawlState** *(string) --*

              The state of the crawler to which this condition
              applies.

      * **Actions** (*list*) --

        **[REQUIRED]**

        The actions initiated by this trigger when it fires.

        * *(dict) --*

          Defines an action to be initiated by a trigger.

          * **JobName** *(string) --*

            The name of a job to be run.

          * **Arguments** *(dict) --*

            The job arguments used when this trigger fires. For this
            job run, they replace the default arguments set in the job
            definition itself.

            You can specify arguments here that your own job-execution
            script consumes, as well as arguments that Glue itself
            consumes.

            For information about how to specify and consume your own
            Job arguments, see the Calling Glue APIs in Python topic
            in the developer guide.

            For information about the key-value pairs that Glue
            consumes to set up your job, see the Special Parameters
            Used by Glue topic in the developer guide.

            * *(string) --*

              * *(string) --*

          * **Timeout** *(integer) --*

            The "JobRun" timeout in minutes. This is the maximum time
            that a job run can consume resources before it is
            terminated and enters "TIMEOUT" status. This overrides the
            timeout value set in the parent job.

            Jobs must have timeout values less than 7 days or 10080
            minutes. Otherwise, the jobs will throw an exception.

            When the value is left blank, the timeout is defaulted to
            2880 minutes.

            Any existing Glue jobs that had a timeout value greater
            than 7 days will be defaulted to 7 days. For instance if
            you have specified a timeout of 20 days for a batch job,
            it will be stopped on the 7th day.

            For streaming jobs, if you have set up a maintenance
            window, it will be restarted during the maintenance window
            after 7 days.

          * **SecurityConfiguration** *(string) --*

            The name of the "SecurityConfiguration" structure to be
            used with this action.

          * **NotificationProperty** *(dict) --*

            Specifies configuration properties of a job run
            notification.

            * **NotifyDelayAfter** *(integer) --*

              After a job run starts, the number of minutes to wait
              before sending a job run delay notification.

          * **CrawlerName** *(string) --*

            The name of the crawler to be used with this action.

      * **Description** (*string*) -- A description of the new
        trigger.

      * **StartOnCreation** (*boolean*) -- Set to "True" to start
        "SCHEDULED" and "CONDITIONAL" triggers when created. "True" is
        not supported for "ON_DEMAND" triggers.

      * **Tags** (*dict*) --

        The tags to use with this trigger. You may use tags to limit
        access to the trigger. For more information about tags in
        Glue, see Amazon Web Services Tags in Glue in the developer
        guide.

        * *(string) --*

          * *(string) --*

      * **EventBatchingCondition** (*dict*) --

        The batch condition that must be met (the specified number of
        events received or the batch time window expired) before the
        EventBridge event trigger fires.

        * **BatchSize** *(integer) --* **[REQUIRED]**

          Number of events that must be received from Amazon
          EventBridge before EventBridge event trigger fires.

        * **BatchWindow** *(integer) --*

          The window of time, in seconds, after which the EventBridge
          event trigger fires. The window starts when the first event
          is received.
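
   The "Schedule" and "EventBatchingCondition" parameters described
   above can be assembled as plain dictionaries. This is a sketch
   under stated assumptions: both helper functions and the trigger and
   job names are hypothetical, and the empty-schedule check is a local
   convenience, not the service's own validation.

```python
def scheduled_trigger_request(name, job_name, schedule, start_on_creation=False):
    """Build create_trigger kwargs for a SCHEDULED trigger that runs one job."""
    if not schedule:
        # Schedule is required when the trigger type is SCHEDULED.
        raise ValueError("a SCHEDULED trigger needs a Schedule expression")
    return {
        "Name": name,
        "Type": "SCHEDULED",
        "Schedule": schedule,
        "Actions": [{"JobName": job_name}],
        "StartOnCreation": start_on_creation,
    }


def event_batching_condition(batch_size, batch_window=None):
    """Build an EventBatchingCondition; BatchSize is required, BatchWindow optional."""
    condition = {"BatchSize": batch_size}
    if batch_window is not None:
        condition["BatchWindow"] = batch_window  # window length in seconds
    return condition


# Hypothetical names; the cron expression is the 12:15 UTC example given above.
request = scheduled_trigger_request(
    "example-daily-trigger", "example-job", "cron(15 12 * * ? *)"
)
batching = event_batching_condition(batch_size=5, batch_window=900)
# With a real client: response = client.create_trigger(**request)
```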

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Name': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Name** *(string) --*

          The name of the trigger.

   **Exceptions**

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.IdempotentParameterMismatchException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.ConcurrentModificationException"
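
   A "CONDITIONAL" trigger needs a "Predicate", and "Logical" becomes
   required once more than one condition is listed. The sketch below
   encodes that rule before any request is sent. The helper name and
   the job names are hypothetical, and rejecting an empty condition
   list is this sketch's own check.

```python
def conditional_trigger_request(name, watched_jobs, job_to_start, logical=None):
    """Build create_trigger kwargs that start a job after other jobs succeed."""
    conditions = [
        {"LogicalOperator": "EQUALS", "JobName": job, "State": "SUCCEEDED"}
        for job in watched_jobs
    ]
    if not conditions:
        # Local check only: this helper refuses to build an empty predicate.
        raise ValueError("a CONDITIONAL trigger needs at least one condition")
    if len(conditions) > 1 and logical is None:
        # Logical is optional for one condition and required for several.
        raise ValueError("Logical ('AND' or 'ANY') is required for multiple conditions")
    predicate = {"Conditions": conditions}
    if logical is not None:
        predicate["Logical"] = logical
    return {
        "Name": name,
        "Type": "CONDITIONAL",
        "Predicate": predicate,
        "Actions": [{"JobName": job_to_start}],
    }


# Hypothetical trigger and job names.
request = conditional_trigger_request(
    "example-after-load",
    ["example-extract", "example-transform"],
    "example-publish",
    logical="AND",
)
# With a real client: response = client.create_trigger(**request)
```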


get_waiter
**********

Glue.Client.get_waiter(waiter_name)

   Returns an object that can wait for some condition.

   Parameters:
      **waiter_name** (*str*) -- The name of the waiter to get. See
      the waiters section of the service docs for a list of available
      waiters.

   Returns:
      The specified waiter object.

   Return type:
      "botocore.waiter.Waiter"


stop_crawler
************

Glue.Client.stop_crawler(**kwargs)

   If the specified crawler is running, stops the crawl.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.stop_crawler(
          Name='string'
      )

   Parameters:
      **Name** (*string*) --

      **[REQUIRED]**

      Name of the crawler to stop.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.CrawlerNotRunningException"

   * "Glue.Client.exceptions.CrawlerStoppingException"

   * "Glue.Client.exceptions.OperationTimeoutException"
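
   Callers that only want the crawler to end up stopped may prefer to
   treat "CrawlerNotRunningException" and "CrawlerStoppingException"
   as non-fatal. A minimal sketch, assuming "client" is a Glue client
   created with "boto3.client('glue')"; the helper name is
   hypothetical.

```python
def stop_crawler_if_running(client, name):
    """Request a stop; return False if the crawler was idle or already stopping."""
    try:
        client.stop_crawler(Name=name)
    except (
        client.exceptions.CrawlerNotRunningException,
        client.exceptions.CrawlerStoppingException,
    ):
        # Nothing to do: the crawler is not running or is already winding down.
        return False
    return True
```

   Other errors, such as "EntityNotFoundException", still propagate so
   that a misspelled crawler name is not silently ignored.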


delete_integration
******************

Glue.Client.delete_integration(**kwargs)

   Deletes the specified Zero-ETL integration.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_integration(
          IntegrationIdentifier='string'
      )

   Parameters:
      **IntegrationIdentifier** (*string*) --

      **[REQUIRED]**

      The Amazon Resource Name (ARN) for the integration.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'SourceArn': 'string',
             'TargetArn': 'string',
             'IntegrationName': 'string',
             'Description': 'string',
             'IntegrationArn': 'string',
             'KmsKeyId': 'string',
             'AdditionalEncryptionContext': {
                 'string': 'string'
             },
             'Tags': [
                 {
                     'key': 'string',
                     'value': 'string'
                 },
             ],
             'Status': 'CREATING'|'ACTIVE'|'MODIFYING'|'FAILED'|'DELETING'|'SYNCING'|'NEEDS_ATTENTION',
             'CreateTime': datetime(2015, 1, 1),
             'Errors': [
                 {
                     'ErrorCode': 'string',
                     'ErrorMessage': 'string'
                 },
             ],
             'DataFilter': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **SourceArn** *(string) --*

          The ARN of the source for the integration.

        * **TargetArn** *(string) --*

          The ARN of the target for the integration.

        * **IntegrationName** *(string) --*

          A unique name for an integration in Glue.

        * **Description** *(string) --*

          A description of the integration.

        * **IntegrationArn** *(string) --*

          The Amazon Resource Name (ARN) for the integration.

        * **KmsKeyId** *(string) --*

          The ARN of a KMS key used for encrypting the channel.

        * **AdditionalEncryptionContext** *(dict) --*

          An optional set of non-secret key–value pairs that contains
          additional contextual information for encryption.

          * *(string) --*

            * *(string) --*

        * **Tags** *(list) --*

          Metadata assigned to the resource consisting of a list of
          key-value pairs.

          * *(dict) --*

            The "Tag" object represents a label that you can assign to
            an Amazon Web Services resource. Each tag consists of a
            key and an optional value, both of which you define.

            For more information about tags, and controlling access to
            resources in Glue, see Amazon Web Services Tags in Glue
            and Specifying Glue Resource ARNs in the developer guide.

            * **key** *(string) --*

              The tag key. The key is required when you create a tag
              on an object. The key is case-sensitive, and must not
              contain the prefix aws.

            * **value** *(string) --*

              The tag value. The value is optional when you create a
              tag on an object. The value is case-sensitive, and must
              not contain the prefix aws.

        * **Status** *(string) --*

          The status of the integration being deleted.

          The possible statuses are:

          * CREATING: The integration is being created.

          * ACTIVE: The integration creation succeeds.

          * MODIFYING: The integration is being modified.

          * FAILED: The integration creation fails.

          * DELETING: The integration is being deleted.

          * SYNCING: The integration is synchronizing.

          * NEEDS_ATTENTION: The integration needs attention, such as
            synchronization.

        * **CreateTime** *(datetime) --*

          The time when the integration was created, in UTC.

        * **Errors** *(list) --*

          A list of errors associated with the integration.

          * *(dict) --*

            An error associated with a zero-ETL integration.

            * **ErrorCode** *(string) --*

              The code associated with this error.

            * **ErrorMessage** *(string) --*

              A message describing the error.

        * **DataFilter** *(string) --*

          Selects source tables for the integration using Maxwell
          filter syntax.

   **Exceptions**

   * "Glue.Client.exceptions.ValidationException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.InternalServerException"

   * "Glue.Client.exceptions.IntegrationNotFoundFault"

   * "Glue.Client.exceptions.IntegrationConflictOperationFault"

   * "Glue.Client.exceptions.InvalidIntegrationStateFault"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.ConflictException"

   * "Glue.Client.exceptions.InvalidStateException"

   * "Glue.Client.exceptions.InvalidInputException"
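
   The response carries the integration's "Status" and any "Errors",
   which are often the only fields a caller inspects. A minimal
   sketch, assuming "client" is a Glue client; the helper name is
   hypothetical.

```python
def delete_integration_and_report(client, integration_arn):
    """Delete an integration and return (status, list of 'code: message' strings)."""
    response = client.delete_integration(IntegrationIdentifier=integration_arn)
    errors = [
        f"{error.get('ErrorCode')}: {error.get('ErrorMessage')}"
        for error in response.get("Errors", [])
    ]
    return response.get("Status"), errors
```

   Both keys are read with "get" because the sketch does not assume
   that every response populates them.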


list_dev_endpoints
******************

Glue.Client.list_dev_endpoints(**kwargs)

   Retrieves the names of all "DevEndpoint" resources in this Amazon
   Web Services account, or the resources with the specified tag. This
   operation allows you to see which resources are available in your
   account, and their names.

   This operation takes the optional "Tags" field, which you can use
   as a filter on the response so that tagged resources can be
   retrieved as a group. If you choose to use tags filtering, only
   resources with the tag are retrieved.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.list_dev_endpoints(
          NextToken='string',
          MaxResults=123,
          Tags={
              'string': 'string'
          }
      )

   Parameters:
      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation request.

      * **MaxResults** (*integer*) -- The maximum size of a list to
        return.

      * **Tags** (*dict*) --

        Specifies to return only these tagged resources.

        * *(string) --*

          * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'DevEndpointNames': [
                 'string',
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **DevEndpointNames** *(list) --*

          The names of all the "DevEndpoint" resources in the account, or
          the "DevEndpoint" resources with the specified tags.

          * *(string) --*

        * **NextToken** *(string) --*

          A continuation token, if the returned list does not contain
          the last "DevEndpoint" name available.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
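
   Because results may span several pages, callers typically loop
   until no "NextToken" is returned. A minimal sketch of that loop,
   assuming "client" is a Glue client; the generator name is
   hypothetical.

```python
def iter_dev_endpoint_names(client, tags=None, page_size=None):
    """Yield every DevEndpoint name, following NextToken across pages."""
    kwargs = {}
    if tags:
        kwargs["Tags"] = tags  # return only resources carrying these tags
    if page_size:
        kwargs["MaxResults"] = page_size
    while True:
        page = client.list_dev_endpoints(**kwargs)
        yield from page.get("DevEndpointNames", [])
        token = page.get("NextToken")
        if not token:
            break  # no continuation token means this was the last page
        kwargs["NextToken"] = token
```

   Usage: "names = list(iter_dev_endpoint_names(client, tags={'team':
   'example'}))", where the tag key and value are placeholders.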


create_table
************

Glue.Client.create_table(**kwargs)

   Creates a new table definition in the Data Catalog.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_table(
          CatalogId='string',
          DatabaseName='string',
          Name='string',
          TableInput={
              'Name': 'string',
              'Description': 'string',
              'Owner': 'string',
              'LastAccessTime': datetime(2015, 1, 1),
              'LastAnalyzedTime': datetime(2015, 1, 1),
              'Retention': 123,
              'StorageDescriptor': {
                  'Columns': [
                      {
                          'Name': 'string',
                          'Type': 'string',
                          'Comment': 'string',
                          'Parameters': {
                              'string': 'string'
                          }
                      },
                  ],
                  'Location': 'string',
                  'AdditionalLocations': [
                      'string',
                  ],
                  'InputFormat': 'string',
                  'OutputFormat': 'string',
                  'Compressed': True|False,
                  'NumberOfBuckets': 123,
                  'SerdeInfo': {
                      'Name': 'string',
                      'SerializationLibrary': 'string',
                      'Parameters': {
                          'string': 'string'
                      }
                  },
                  'BucketColumns': [
                      'string',
                  ],
                  'SortColumns': [
                      {
                          'Column': 'string',
                          'SortOrder': 123
                      },
                  ],
                  'Parameters': {
                      'string': 'string'
                  },
                  'SkewedInfo': {
                      'SkewedColumnNames': [
                          'string',
                      ],
                      'SkewedColumnValues': [
                          'string',
                      ],
                      'SkewedColumnValueLocationMaps': {
                          'string': 'string'
                      }
                  },
                  'StoredAsSubDirectories': True|False,
                  'SchemaReference': {
                      'SchemaId': {
                          'SchemaArn': 'string',
                          'SchemaName': 'string',
                          'RegistryName': 'string'
                      },
                      'SchemaVersionId': 'string',
                      'SchemaVersionNumber': 123
                  }
              },
              'PartitionKeys': [
                  {
                      'Name': 'string',
                      'Type': 'string',
                      'Comment': 'string',
                      'Parameters': {
                          'string': 'string'
                      }
                  },
              ],
              'ViewOriginalText': 'string',
              'ViewExpandedText': 'string',
              'TableType': 'string',
              'Parameters': {
                  'string': 'string'
              },
              'TargetTable': {
                  'CatalogId': 'string',
                  'DatabaseName': 'string',
                  'Name': 'string',
                  'Region': 'string'
              },
              'ViewDefinition': {
                  'IsProtected': True|False,
                  'Definer': 'string',
                  'Representations': [
                      {
                          'Dialect': 'REDSHIFT'|'ATHENA'|'SPARK',
                          'DialectVersion': 'string',
                          'ViewOriginalText': 'string',
                          'ValidationConnection': 'string',
                          'ViewExpandedText': 'string'
                      },
                  ],
                  'SubObjects': [
                      'string',
                  ]
              }
          },
          PartitionIndexes=[
              {
                  'Keys': [
                      'string',
                  ],
                  'IndexName': 'string'
              },
          ],
          TransactionId='string',
          OpenTableFormatInput={
              'IcebergInput': {
                  'MetadataOperation': 'CREATE',
                  'Version': 'string',
                  'CreateIcebergTableInput': {
                      'Location': 'string',
                      'Schema': {
                          'SchemaId': 123,
                          'IdentifierFieldIds': [
                              123,
                          ],
                          'Type': 'struct',
                          'Fields': [
                              {
                                  'Id': 123,
                                  'Name': 'string',
                                  'Type': {...}|[...]|123|123.4|'string'|True|None,
                                  'Required': True|False,
                                  'Doc': 'string'
                              },
                          ]
                      },
                      'PartitionSpec': {
                          'Fields': [
                              {
                                  'SourceId': 123,
                                  'Transform': 'string',
                                  'Name': 'string',
                                  'FieldId': 123
                              },
                          ],
                          'SpecId': 123
                      },
                      'WriteOrder': {
                          'OrderId': 123,
                          'Fields': [
                              {
                                  'SourceId': 123,
                                  'Transform': 'string',
                                  'Direction': 'asc'|'desc',
                                  'NullOrder': 'nulls-first'|'nulls-last'
                              },
                          ]
                      },
                      'Properties': {
                          'string': 'string'
                      }
                  }
              }
          }
      )
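
   Most keys in the request syntax are optional. The following is a
   minimal sketch of a much smaller request for an external table,
   built from the fields described in the parameter list. The helper
   name, database, table, columns, and S3 location are hypothetical
   placeholders. "PartitionKeys" is set to an empty list because a
   table used by Amazon Athena needs that key present even when the
   table is unpartitioned.

```python
def minimal_table_request(database, table, columns, location):
    """Build create_table kwargs for an unpartitioned external table."""
    return {
        "DatabaseName": database,
        "TableInput": {
            "Name": table,
            "TableType": "EXTERNAL_TABLE",
            "StorageDescriptor": {
                # columns is an iterable of (name, type) pairs
                "Columns": [{"Name": n, "Type": t} for n, t in columns],
                "Location": location,
            },
            # Present but empty for an unpartitioned table.
            "PartitionKeys": [],
        },
    }


# Placeholder names and location for illustration only.
request = minimal_table_request(
    "example_db",
    "example_events",
    [("event_id", "string"), ("event_count", "bigint")],
    "s3://example-bucket/example_events/",
)
# With a real client: response = client.create_table(**request)
```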

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog in
        which to create the "Table". If none is supplied, the Amazon
        Web Services account ID is used by default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The catalog database in which to create the new table. For
        Hive compatibility, this name is entirely lowercase.

      * **Name** (*string*) -- The unique identifier for the table
        within the specified database that will be created in the Glue
        Data Catalog.

      * **TableInput** (*dict*) --

        The "TableInput" object that defines the metadata table to
        create in the catalog.

        * **Name** *(string) --* **[REQUIRED]**

          The table name. For Hive compatibility, this is folded to
          lowercase when it is stored.

        * **Description** *(string) --*

          A description of the table.

        * **Owner** *(string) --*

          The table owner. Included for Apache Hive compatibility. Not
          used in the normal course of Glue operations.

        * **LastAccessTime** *(datetime) --*

          The last time that the table was accessed.

        * **LastAnalyzedTime** *(datetime) --*

          The last time that column statistics were computed for this
          table.

        * **Retention** *(integer) --*

          The retention time for this table.

        * **StorageDescriptor** *(dict) --*

          A storage descriptor containing information about the
          physical storage of this table.

          * **Columns** *(list) --*

            A list of the "Columns" in the table.

            * *(dict) --*

              A column in a "Table".

              * **Name** *(string) --* **[REQUIRED]**

                The name of the "Column".

              * **Type** *(string) --*

                The data type of the "Column".

              * **Comment** *(string) --*

                A free-form text comment.

              * **Parameters** *(dict) --*

                These key-value pairs define properties associated
                with the column.

                * *(string) --*

                  * *(string) --*

          * **Location** *(string) --*

            The physical location of the table. By default, this takes
            the form of the warehouse location, followed by the
            database location in the warehouse, followed by the table
            name.

          * **AdditionalLocations** *(list) --*

            A list of locations that point to the path where a Delta
            table is located.

            * *(string) --*

          * **InputFormat** *(string) --*

            The input format: "SequenceFileInputFormat" (binary), or
            "TextInputFormat", or a custom format.

          * **OutputFormat** *(string) --*

            The output format: "SequenceFileOutputFormat" (binary), or
            "IgnoreKeyTextOutputFormat", or a custom format.

          * **Compressed** *(boolean) --*

            "True" if the data in the table is compressed, or "False"
            if not.

          * **NumberOfBuckets** *(integer) --*

            Must be specified if the table contains any dimension
            columns.

          * **SerdeInfo** *(dict) --*

            The serialization/deserialization (SerDe) information.

            * **Name** *(string) --*

              Name of the SerDe.

            * **SerializationLibrary** *(string) --*

              Usually the class that implements the SerDe. An example
              is
              "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe".

            * **Parameters** *(dict) --*

              These key-value pairs define initialization parameters
              for the SerDe.

              * *(string) --*

                * *(string) --*

          * **BucketColumns** *(list) --*

            A list of reducer grouping columns, clustering columns,
            and bucketing columns in the table.

            * *(string) --*

          * **SortColumns** *(list) --*

            A list specifying the sort order of each bucket in the
            table.

            * *(dict) --*

              Specifies the sort order of a sorted column.

              * **Column** *(string) --* **[REQUIRED]**

                The name of the column.

              * **SortOrder** *(integer) --* **[REQUIRED]**

                Indicates that the column is sorted in ascending order
                ("== 1"), or in descending order ("== 0").

          * **Parameters** *(dict) --*

            The user-supplied properties in key-value form.

            * *(string) --*

              * *(string) --*

          * **SkewedInfo** *(dict) --*

            The information about values that appear frequently in a
            column (skewed values).

            * **SkewedColumnNames** *(list) --*

              A list of names of columns that contain skewed values.

              * *(string) --*

            * **SkewedColumnValues** *(list) --*

              A list of values that appear so frequently as to be
              considered skewed.

              * *(string) --*

            * **SkewedColumnValueLocationMaps** *(dict) --*

              A mapping of skewed values to the columns that contain
              them.

              * *(string) --*

                * *(string) --*

          * **StoredAsSubDirectories** *(boolean) --*

            "True" if the table data is stored in subdirectories, or
            "False" if not.

          * **SchemaReference** *(dict) --*

            An object that references a schema stored in the Glue
            Schema Registry.

            When creating a table, you can pass an empty list of
            columns for the schema, and instead use a schema
            reference.

            * **SchemaId** *(dict) --*

              A structure that contains schema identity fields. Either
              this or the "SchemaVersionId" has to be provided.

              * **SchemaArn** *(string) --*

                The Amazon Resource Name (ARN) of the schema. One of
                "SchemaArn" or "SchemaName" has to be provided.

              * **SchemaName** *(string) --*

                The name of the schema. One of "SchemaArn" or
                "SchemaName" has to be provided.

              * **RegistryName** *(string) --*

                The name of the schema registry that contains the
                schema.

            * **SchemaVersionId** *(string) --*

              The unique ID assigned to a version of the schema.
              Either this or the "SchemaId" has to be provided.

            * **SchemaVersionNumber** *(integer) --*

              The version number of the schema.

        * **PartitionKeys** *(list) --*

          A list of columns by which the table is partitioned. Only
          primitive types are supported as partition keys.

          When you create a table used by Amazon Athena, and you do
          not specify any "PartitionKeys", you must at least set the
          value of "PartitionKeys" to an empty list. For example:

          ""PartitionKeys": []"

          * *(dict) --*

            A column in a "Table".

            * **Name** *(string) --* **[REQUIRED]**

              The name of the "Column".

            * **Type** *(string) --*

              The data type of the "Column".

            * **Comment** *(string) --*

              A free-form text comment.

            * **Parameters** *(dict) --*

              These key-value pairs define properties associated with
              the column.

              * *(string) --*

                * *(string) --*

        * **ViewOriginalText** *(string) --*

          Included for Apache Hive compatibility. Not used in the
          normal course of Glue operations. If the table is a
          "VIRTUAL_VIEW", this field holds certain Athena
          configuration encoded in base64.

        * **ViewExpandedText** *(string) --*

          Included for Apache Hive compatibility. Not used in the
          normal course of Glue operations.

        * **TableType** *(string) --*

          The type of this table. Glue will create tables with the
          "EXTERNAL_TABLE" type. Other services, such as Athena, may
          create tables with additional table types.

          Glue related table types:

             EXTERNAL_TABLE
                Hive compatible attribute - indicates a non-Hive
                managed table.

             GOVERNED
                Used by Lake Formation. The Glue Data Catalog
                understands "GOVERNED".

        * **Parameters** *(dict) --*

          These key-value pairs define properties associated with the
          table.

          * *(string) --*

            * *(string) --*

        * **TargetTable** *(dict) --*

          A "TableIdentifier" structure that describes a target table
          for resource linking.

          * **CatalogId** *(string) --*

            The ID of the Data Catalog in which the table resides.

          * **DatabaseName** *(string) --*

            The name of the catalog database that contains the target
            table.

          * **Name** *(string) --*

            The name of the target table.

          * **Region** *(string) --*

            Region of the target table.

        * **ViewDefinition** *(dict) --*

          A structure that contains all the information that defines
          the view, including the dialect or dialects for the view,
          and the query.

          * **IsProtected** *(boolean) --*

            You can set this flag as true to instruct the engine not
            to push user-provided operations into the logical plan of
            the view during query planning. However, setting this flag
            does not guarantee that the engine will comply. Refer to
            the engine's documentation to understand the guarantees
            provided, if any.

          * **Definer** *(string) --*

            The definer of a view in SQL.

          * **Representations** *(list) --*

            A list of structures that contains the dialect of the
            view, and the query that defines the view.

            * *(dict) --*

              A structure containing details of a representation to
              update or create a Lake Formation view.

              * **Dialect** *(string) --*

                A parameter that specifies the engine type of a
                specific representation.

              * **DialectVersion** *(string) --*

                A parameter that specifies the version of the engine
                of a specific representation.

              * **ViewOriginalText** *(string) --*

                A string that represents the original SQL query that
                describes the view.

              * **ValidationConnection** *(string) --*

                The name of the connection to be used to validate the
                specific representation of the view.

              * **ViewExpandedText** *(string) --*

                A string that represents the SQL query that describes
                the view with expanded resource ARNs.

          * **SubObjects** *(list) --*

            A list of base table ARNs that make up the view.

            * *(string) --*

      * **PartitionIndexes** (*list*) --

        A list of partition indexes, "PartitionIndex" structures, to
        create in the table.

        * *(dict) --*

          A structure for a partition index.

          * **Keys** *(list) --* **[REQUIRED]**

            The keys for the partition index.

            * *(string) --*

          * **IndexName** *(string) --* **[REQUIRED]**

            The name of the partition index.

      * **TransactionId** (*string*) -- The ID of the transaction.

      * **OpenTableFormatInput** (*dict*) --

        Specifies an "OpenTableFormatInput" structure when creating an
        open format table.

        * **IcebergInput** *(dict) --*

          Specifies an "IcebergInput" structure that defines an Apache
          Iceberg metadata table.

          * **MetadataOperation** *(string) --* **[REQUIRED]**

            A required metadata operation. Can only be set to
            "CREATE".

          * **Version** *(string) --*

            The table version for the Iceberg table. Defaults to 2.

          * **CreateIcebergTableInput** *(dict) --*

            The configuration parameters required to create a new
            Iceberg table in the Glue Data Catalog, including table
            properties and metadata specifications.

            * **Location** *(string) --* **[REQUIRED]**

              The S3 location where the Iceberg table data will be
              stored.

            * **Schema** *(dict) --* **[REQUIRED]**

              The schema definition that specifies the structure,
              field types, and metadata for the Iceberg table.

              * **SchemaId** *(integer) --*

                The unique identifier for this schema version within
                the Iceberg table's schema evolution history.

              * **IdentifierFieldIds** *(list) --*

                The list of field identifiers that uniquely identify
                records in the table, used for row-level operations
                and deduplication.

                * *(integer) --*

              * **Type** *(string) --*

                The root type of the schema structure, typically
                "struct" for Iceberg table schemas.

              * **Fields** *(list) --* **[REQUIRED]**

                The list of field definitions that make up the table
                schema, including field names, types, and metadata.

                * *(dict) --*

                  Defines a single field within an Iceberg table
                  schema, including its identifier, name, data type,
                  nullability, and documentation.

                  * **Id** *(integer) --* **[REQUIRED]**

                    The unique identifier assigned to this field
                    within the Iceberg table schema, used for schema
                    evolution and field tracking.

                  * **Name** *(string) --* **[REQUIRED]**

                    The name of the field as it appears in the table
                    schema and query operations.

                  * **Type** (*document*) -- **[REQUIRED]**

                    The data type definition for this field,
                    specifying the structure and format of the data it
                    contains.

                  * **Required** *(boolean) --* **[REQUIRED]**

                    Indicates whether this field is required
                    (non-nullable) or optional (nullable) in the table
                    schema.

                  * **Doc** *(string) --*

                    Optional documentation or description text that
                    provides additional context about the purpose and
                    usage of this field.

            * **PartitionSpec** *(dict) --*

              The partitioning specification that defines how the
              Iceberg table data will be organized and partitioned for
              optimal query performance.

              * **Fields** *(list) --* **[REQUIRED]**

                The list of partition fields that define how the table
                data should be partitioned, including source fields
                and their transformations.

                * *(dict) --*

                  Defines a single partition field within an Iceberg
                  partition specification, including the source field,
                  transformation function, partition name, and unique
                  identifier.

                  * **SourceId** *(integer) --* **[REQUIRED]**

                    The identifier of the source field from the table
                    schema that this partition field is based on.

                  * **Transform** *(string) --* **[REQUIRED]**

                    The transformation function applied to the source
                    field to create the partition, such as identity,
                    bucket, truncate, year, month, day, or hour.

                  * **Name** *(string) --* **[REQUIRED]**

                    The name of the partition field as it will appear
                    in the partitioned table structure.

                  * **FieldId** *(integer) --*

                    The unique identifier assigned to this partition
                    field within the Iceberg table's partition
                    specification.

              * **SpecId** *(integer) --*

                The unique identifier for this partition specification
                within the Iceberg table's metadata history.

            * **WriteOrder** *(dict) --*

              The sort order specification that defines how data
              should be ordered within each partition to optimize
              query performance.

              * **OrderId** *(integer) --* **[REQUIRED]**

                The unique identifier for this sort order
                specification within the Iceberg table's metadata.

              * **Fields** *(list) --* **[REQUIRED]**

                The list of fields and their sort directions that
                define the ordering criteria for the Iceberg table
                data.

                * *(dict) --*

                  Defines a single field within an Iceberg sort order
                  specification, including the source field,
                  transformation, sort direction, and null value
                  ordering.

                  * **SourceId** *(integer) --* **[REQUIRED]**

                    The identifier of the source field from the table
                    schema that this sort field is based on.

                  * **Transform** *(string) --* **[REQUIRED]**

                    The transformation function applied to the source
                    field before sorting, such as identity, bucket, or
                    truncate.

                  * **Direction** *(string) --* **[REQUIRED]**

                    The sort direction for this field, either
                    ascending or descending.

                  * **NullOrder** *(string) --* **[REQUIRED]**

                    The ordering behavior for null values in this
                    field, specifying whether nulls should appear
                    first or last in the sort order.

            * **Properties** *(dict) --*

              Key-value pairs of additional table properties and
              configuration settings for the Iceberg table.

              * *(string) --*

                * *(string) --*
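
   The nested "OpenTableFormatInput" structure described above may be
   easier to follow as a concrete dictionary. The following is a
   minimal sketch under stated assumptions, not an official example:
   the helper names ("iceberg_field", "build_open_table_format_input"),
   the S3 location, the field names, and the Iceberg type strings
   ("long", "timestamp") are illustrative. Only the dictionary keys are
   taken from the parameter list above.

```python
def iceberg_field(field_id, name, field_type, required=False, doc=None):
    """Build one entry of Schema.Fields (hypothetical helper).

    Id, Name, Type and Required are marked [REQUIRED] above; Doc is optional.
    """
    field = {
        "Id": field_id,
        "Name": name,
        "Type": field_type,
        "Required": required,
    }
    if doc is not None:
        field["Doc"] = doc
    return field


def build_open_table_format_input(location, fields, partition_fields=None,
                                  properties=None):
    """Assemble an OpenTableFormatInput dict for an Iceberg table (hypothetical helper)."""
    create_input = {
        "Location": location,
        # "struct" is described above as the typical root type.
        "Schema": {"Type": "struct", "Fields": list(fields)},
    }
    if partition_fields:
        create_input["PartitionSpec"] = {"Fields": list(partition_fields)}
    if properties:
        create_input["Properties"] = dict(properties)
    return {
        "IcebergInput": {
            # The parameter list states this can only be set to "CREATE".
            "MetadataOperation": "CREATE",
            "CreateIcebergTableInput": create_input,
        }
    }


# Illustrative values only; the bucket and column names are made up.
open_table_format_input = build_open_table_format_input(
    location="s3://example-bucket/warehouse/events/",
    fields=[
        iceberg_field(1, "event_id", "long", required=True),
        iceberg_field(2, "event_time", "timestamp", required=True,
                      doc="Time at which the event occurred"),
    ],
    partition_fields=[
        # SourceId refers to the Id of a schema field defined above.
        {"SourceId": 2, "Transform": "day", "Name": "event_day"},
    ],
)
```

   The resulting dictionary would be passed as the
   "OpenTableFormatInput" argument. "Version" is omitted here, so the
   documented default applies.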

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"

   * "Glue.Client.exceptions.ConcurrentModificationException"

   * "Glue.Client.exceptions.ResourceNotReadyException"

   * "Glue.Client.exceptions.FederationSourceException"

   * "Glue.Client.exceptions.FederationSourceRetryableException"
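
   As a hedged illustration of the exceptions listed above, the
   following sketch treats "AlreadyExistsException" as a non-fatal
   outcome. It assumes this section documents "create_table" (the
   method name is not visible in this excerpt). The helper name
   "create_table_if_absent", the database and table names, and the
   keyword names "DatabaseName" and "TableInput" (with its "Name" key)
   are not shown in the text above; the keywords come from the boto3
   "create_table" signature. "PartitionKeys" is set to an empty list,
   as the parameter description recommends for tables used by Amazon
   Athena.

```python
def create_table_if_absent(client, **create_table_kwargs):
    """Call create_table and report whether a new table was created.

    Hypothetical helper: returns True on success and False when the
    service reports that the table already exists. Other exceptions
    propagate to the caller.
    """
    try:
        client.create_table(**create_table_kwargs)
    except client.exceptions.AlreadyExistsException:
        return False
    return True


# Illustrative request body; all names and values are made up.
table_input = {
    "Name": "events",
    "TableType": "EXTERNAL_TABLE",
    "PartitionKeys": [],  # explicit empty list, see the PartitionKeys note
    "Parameters": {"classification": "parquet"},
}
```

   With a real client (for example "client = boto3.client('glue')"),
   the helper could be called as "create_table_if_absent(client,
   DatabaseName='analytics', TableInput=table_input)".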


get_job_bookmark
****************

Glue.Client.get_job_bookmark(**kwargs)

   Returns information on a job bookmark entry.

   For more information about enabling and using job bookmarks, see:

   * Tracking processed data using job bookmarks

   * Job parameters used by Glue

   * Job structure

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_job_bookmark(
          JobName='string',
          RunId='string'
      )

   Parameters:
      * **JobName** (*string*) --

        **[REQUIRED]**

        The name of the job in question.

      * **RunId** (*string*) -- The unique run identifier associated
        with this job run.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'JobBookmarkEntry': {
                 'JobName': 'string',
                 'Version': 123,
                 'Run': 123,
                 'Attempt': 123,
                 'PreviousRunId': 'string',
                 'RunId': 'string',
                 'JobBookmark': 'string'
             }
         }

      **Response Structure**

      * *(dict) --*

        * **JobBookmarkEntry** *(dict) --*

          A structure that defines a point from which a job can
          resume processing.

          * **JobName** *(string) --*

            The name of the job in question.

          * **Version** *(integer) --*

            The version of the job.

          * **Run** *(integer) --*

            The run ID number.

          * **Attempt** *(integer) --*

            The attempt ID number.

          * **PreviousRunId** *(string) --*

            The unique run identifier associated with the previous job
            run.

          * **RunId** *(string) --*

            The unique run identifier associated with this job run.

          * **JobBookmark** *(string) --*

            The bookmark itself.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ValidationException"
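
   The request and response shapes above can be combined into a small
   wrapper. This is a sketch under stated assumptions rather than a
   recommended pattern: the helper name "fetch_job_bookmark" and the
   choice to return "None" when no bookmark entry is found are
   illustrative. "RunId" is only sent when the caller supplies it,
   since it is not marked as required.

```python
def fetch_job_bookmark(client, job_name, run_id=None):
    """Return the JobBookmarkEntry dict for a job, or None if not found.

    Hypothetical helper around get_job_bookmark.
    """
    kwargs = {"JobName": job_name}
    if run_id is not None:
        kwargs["RunId"] = run_id
    try:
        response = client.get_job_bookmark(**kwargs)
    except client.exceptions.EntityNotFoundException:
        # Listed among the exceptions above; treated here as "no bookmark".
        return None
    return response.get("JobBookmarkEntry")
```

   With a real client, "fetch_job_bookmark(client, 'nightly-etl')"
   would return a dictionary with keys such as "JobName", "Run",
   "Attempt", and "JobBookmark"; the job name here is made up.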


get_table_versions
******************

Glue.Client.get_table_versions(**kwargs)

   Retrieves a list of structures that describe the available versions
   of a specified table.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_table_versions(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          NextToken='string',
          MaxResults=123
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog where
        the tables reside. If none is provided, the Amazon Web
        Services account ID is used by default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The database in the catalog in which the table resides. For
        Hive compatibility, this name is entirely lowercase.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table. For Hive compatibility, this name is
        entirely lowercase.

      * **NextToken** (*string*) -- A continuation token, if this is
        not the first call.

      * **MaxResults** (*integer*) -- The maximum number of table
        versions to return in one response.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'TableVersions': [
                 {
                     'Table': {
                         'Name': 'string',
                         'DatabaseName': 'string',
                         'Description': 'string',
                         'Owner': 'string',
                         'CreateTime': datetime(2015, 1, 1),
                         'UpdateTime': datetime(2015, 1, 1),
                         'LastAccessTime': datetime(2015, 1, 1),
                         'LastAnalyzedTime': datetime(2015, 1, 1),
                         'Retention': 123,
                         'StorageDescriptor': {
                             'Columns': [
                                 {
                                     'Name': 'string',
                                     'Type': 'string',
                                     'Comment': 'string',
                                     'Parameters': {
                                         'string': 'string'
                                     }
                                 },
                             ],
                             'Location': 'string',
                             'AdditionalLocations': [
                                 'string',
                             ],
                             'InputFormat': 'string',
                             'OutputFormat': 'string',
                             'Compressed': True|False,
                             'NumberOfBuckets': 123,
                             'SerdeInfo': {
                                 'Name': 'string',
                                 'SerializationLibrary': 'string',
                                 'Parameters': {
                                     'string': 'string'
                                 }
                             },
                             'BucketColumns': [
                                 'string',
                             ],
                             'SortColumns': [
                                 {
                                     'Column': 'string',
                                     'SortOrder': 123
                                 },
                             ],
                             'Parameters': {
                                 'string': 'string'
                             },
                             'SkewedInfo': {
                                 'SkewedColumnNames': [
                                     'string',
                                 ],
                                 'SkewedColumnValues': [
                                     'string',
                                 ],
                                 'SkewedColumnValueLocationMaps': {
                                     'string': 'string'
                                 }
                             },
                             'StoredAsSubDirectories': True|False,
                             'SchemaReference': {
                                 'SchemaId': {
                                     'SchemaArn': 'string',
                                     'SchemaName': 'string',
                                     'RegistryName': 'string'
                                 },
                                 'SchemaVersionId': 'string',
                                 'SchemaVersionNumber': 123
                             }
                         },
                         'PartitionKeys': [
                             {
                                 'Name': 'string',
                                 'Type': 'string',
                                 'Comment': 'string',
                                 'Parameters': {
                                     'string': 'string'
                                 }
                             },
                         ],
                         'ViewOriginalText': 'string',
                         'ViewExpandedText': 'string',
                         'TableType': 'string',
                         'Parameters': {
                             'string': 'string'
                         },
                         'CreatedBy': 'string',
                         'IsRegisteredWithLakeFormation': True|False,
                         'TargetTable': {
                             'CatalogId': 'string',
                             'DatabaseName': 'string',
                             'Name': 'string',
                             'Region': 'string'
                         },
                         'CatalogId': 'string',
                         'VersionId': 'string',
                         'FederatedTable': {
                             'Identifier': 'string',
                             'DatabaseIdentifier': 'string',
                             'ConnectionName': 'string',
                             'ConnectionType': 'string'
                         },
                         'ViewDefinition': {
                             'IsProtected': True|False,
                             'Definer': 'string',
                             'SubObjects': [
                                 'string',
                             ],
                             'Representations': [
                                 {
                                     'Dialect': 'REDSHIFT'|'ATHENA'|'SPARK',
                                     'DialectVersion': 'string',
                                     'ViewOriginalText': 'string',
                                     'ViewExpandedText': 'string',
                                     'ValidationConnection': 'string',
                                     'IsStale': True|False
                                 },
                             ]
                         },
                         'IsMultiDialectView': True|False,
                         'Status': {
                             'RequestedBy': 'string',
                             'UpdatedBy': 'string',
                             'RequestTime': datetime(2015, 1, 1),
                             'UpdateTime': datetime(2015, 1, 1),
                             'Action': 'UPDATE'|'CREATE',
                             'State': 'QUEUED'|'IN_PROGRESS'|'SUCCESS'|'STOPPED'|'FAILED',
                             'Error': {
                                 'ErrorCode': 'string',
                                 'ErrorMessage': 'string'
                             },
                             'Details': {
                                 'RequestedChange': {'... recursive ...'},
                                 'ViewValidations': [
                                     {
                                         'Dialect': 'REDSHIFT'|'ATHENA'|'SPARK',
                                         'DialectVersion': 'string',
                                         'ViewValidationText': 'string',
                                         'UpdateTime': datetime(2015, 1, 1),
                                         'State': 'QUEUED'|'IN_PROGRESS'|'SUCCESS'|'STOPPED'|'FAILED',
                                         'Error': {
                                             'ErrorCode': 'string',
                                             'ErrorMessage': 'string'
                                         }
                                     },
                                 ]
                             }
                         }
                     },
                     'VersionId': 'string'
                 },
             ],
             'NextToken': 'string'
         }
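
      Because the response carries a "NextToken", a caller may need
      several requests to see every version. The following sketch
      shows one way to follow the token manually. The generator name
      "iter_table_versions" and its argument names are hypothetical;
      the request and response keys are the ones shown above.

```python
def iter_table_versions(client, database_name, table_name,
                        catalog_id=None, page_size=None):
    """Yield every TableVersions entry, following NextToken across pages.

    Hypothetical helper around get_table_versions.
    """
    kwargs = {"DatabaseName": database_name, "TableName": table_name}
    if catalog_id is not None:
        kwargs["CatalogId"] = catalog_id
    if page_size is not None:
        kwargs["MaxResults"] = page_size

    while True:
        response = client.get_table_versions(**kwargs)
        for version in response.get("TableVersions", []):
            yield version
        token = response.get("NextToken")
        if not token:
            break
        # Send the continuation token with the next request.
        kwargs["NextToken"] = token
```

      Each yielded item is a dictionary with "Table" and "VersionId"
      keys. "client.can_paginate('get_table_versions')" can be used to
      check whether the client offers a built-in paginator for this
      operation instead.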

      **Response Structure**

      * *(dict) --*

        * **TableVersions** *(list) --*

          A list of structures, one for each available version of the
          specified table.

          * *(dict) --*

            Specifies a version of a table.

            * **Table** *(dict) --*

              The table in question.

              * **Name** *(string) --*

                The table name. For Hive compatibility, this must be
                entirely lowercase.

              * **DatabaseName** *(string) --*

                The name of the database where the table metadata
                resides. For Hive compatibility, this must be all
                lowercase.

              * **Description** *(string) --*

                A description of the table.

              * **Owner** *(string) --*

                The owner of the table.

              * **CreateTime** *(datetime) --*

                The time when the table definition was created in the
                Data Catalog.

              * **UpdateTime** *(datetime) --*

                The last time that the table was updated.

              * **LastAccessTime** *(datetime) --*

                The last time that the table was accessed. This is
                usually taken from HDFS, and might not be reliable.

              * **LastAnalyzedTime** *(datetime) --*

                The last time that column statistics were computed for
                this table.

              * **Retention** *(integer) --*

                The retention time for this table.

              * **StorageDescriptor** *(dict) --*

                A storage descriptor containing information about the
                physical storage of this table.

                * **Columns** *(list) --*

                  A list of the "Columns" in the table.

                  * *(dict) --*

                    A column in a "Table".

                    * **Name** *(string) --*

                      The name of the "Column".

                    * **Type** *(string) --*

                      The data type of the "Column".

                    * **Comment** *(string) --*

                      A free-form text comment.

                    * **Parameters** *(dict) --*

                      These key-value pairs define properties
                      associated with the column.

                      * *(string) --*

                        * *(string) --*

                * **Location** *(string) --*

                  The physical location of the table. By default, this
                  takes the form of the warehouse location, followed
                  by the database location in the warehouse, followed
                  by the table name.

                * **AdditionalLocations** *(list) --*

                  A list of locations that point to the path where a
                  Delta table is located.

                  * *(string) --*

                * **InputFormat** *(string) --*

                  The input format: "SequenceFileInputFormat"
                  (binary), or "TextInputFormat", or a custom format.

                * **OutputFormat** *(string) --*

                  The output format: "SequenceFileOutputFormat"
                  (binary), or "IgnoreKeyTextOutputFormat", or a
                  custom format.

                * **Compressed** *(boolean) --*

                  "True" if the data in the table is compressed, or
                  "False" if not.

                * **NumberOfBuckets** *(integer) --*

                  Must be specified if the table contains any
                  dimension columns.

                * **SerdeInfo** *(dict) --*

                  The serialization/deserialization (SerDe)
                  information.

                  * **Name** *(string) --*

                    Name of the SerDe.

                  * **SerializationLibrary** *(string) --*

                    Usually the class that implements the SerDe. An
                    example is
                    "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe".

                  * **Parameters** *(dict) --*

                    These key-value pairs define initialization
                    parameters for the SerDe.

                    * *(string) --*

                      * *(string) --*

                * **BucketColumns** *(list) --*

                  A list of reducer grouping columns, clustering
                  columns, and bucketing columns in the table.

                  * *(string) --*

                * **SortColumns** *(list) --*

                  A list specifying the sort order of each bucket in
                  the table.

                  * *(dict) --*

                    Specifies the sort order of a sorted column.

                    * **Column** *(string) --*

                      The name of the column.

                    * **SortOrder** *(integer) --*

                      Indicates that the column is sorted in ascending
                      order ("== 1"), or in descending order
                      ("== 0").

                * **Parameters** *(dict) --*

                  The user-supplied properties in key-value form.

                  * *(string) --*

                    * *(string) --*

                * **SkewedInfo** *(dict) --*

                  The information about values that appear frequently
                  in a column (skewed values).

                  * **SkewedColumnNames** *(list) --*

                    A list of names of columns that contain skewed
                    values.

                    * *(string) --*

                  * **SkewedColumnValues** *(list) --*

                    A list of values that appear so frequently as to
                    be considered skewed.

                    * *(string) --*

                  * **SkewedColumnValueLocationMaps** *(dict) --*

                    A mapping of skewed values to the columns that
                    contain them.

                    * *(string) --*

                      * *(string) --*

                * **StoredAsSubDirectories** *(boolean) --*

                  "True" if the table data is stored in
                  subdirectories, or "False" if not.

                * **SchemaReference** *(dict) --*

                  An object that references a schema stored in the
                  Glue Schema Registry.

                  When creating a table, you can pass an empty list of
                  columns for the schema, and instead use a schema
                  reference.

                  * **SchemaId** *(dict) --*

                    A structure that contains schema identity fields.
                    Either this or the "SchemaVersionId" has to be
                    provided.

                    * **SchemaArn** *(string) --*

                      The Amazon Resource Name (ARN) of the schema.
                      One of "SchemaArn" or "SchemaName" has to be
                      provided.

                    * **SchemaName** *(string) --*

                      The name of the schema. One of "SchemaArn" or
                      "SchemaName" has to be provided.

                    * **RegistryName** *(string) --*

                      The name of the schema registry that contains
                      the schema.

                  * **SchemaVersionId** *(string) --*

                    The unique ID assigned to a version of the schema.
                    Either this or the "SchemaId" has to be provided.

                  * **SchemaVersionNumber** *(integer) --*

                    The version number of the schema.

              * **PartitionKeys** *(list) --*

                A list of columns by which the table is partitioned.
                Only primitive types are supported as partition keys.

                When you create a table used by Amazon Athena, and you
                do not specify any "PartitionKeys", you must at least
                set the value of "PartitionKeys" to an empty list. For
                example:

                ""PartitionKeys": []"

                * *(dict) --*

                  A column in a "Table".

                  * **Name** *(string) --*

                    The name of the "Column".

                  * **Type** *(string) --*

                    The data type of the "Column".

                  * **Comment** *(string) --*

                    A free-form text comment.

                  * **Parameters** *(dict) --*

                    These key-value pairs define properties associated
                    with the column.

                    * *(string) --*

                      * *(string) --*

              * **ViewOriginalText** *(string) --*

                Included for Apache Hive compatibility. Not used in
                the normal course of Glue operations. If the table is
                a "VIRTUAL_VIEW", this field holds certain Athena
                configuration encoded in base64.

              * **ViewExpandedText** *(string) --*

                Included for Apache Hive compatibility. Not used in
                the normal course of Glue operations.

              * **TableType** *(string) --*

                The type of this table. Glue will create tables with
                the "EXTERNAL_TABLE" type. Other services, such as
                Athena, may create tables with additional table types.

                Glue related table types:

                EXTERNAL_TABLE
                   Hive compatible attribute - indicates a non-Hive
                   managed table.

                GOVERNED
                   Used by Lake Formation. The Glue Data Catalog
                   understands "GOVERNED".

              * **Parameters** *(dict) --*

                These key-value pairs define properties associated
                with the table.

                * *(string) --*

                  * *(string) --*

              * **CreatedBy** *(string) --*

                The person or entity who created the table.

              * **IsRegisteredWithLakeFormation** *(boolean) --*

                Indicates whether the table has been registered with
                Lake Formation.

              * **TargetTable** *(dict) --*

                A "TableIdentifier" structure that describes a target
                table for resource linking.

                * **CatalogId** *(string) --*

                  The ID of the Data Catalog in which the table
                  resides.

                * **DatabaseName** *(string) --*

                  The name of the catalog database that contains the
                  target table.

                * **Name** *(string) --*

                  The name of the target table.

                * **Region** *(string) --*

                  Region of the target table.

              * **CatalogId** *(string) --*

                The ID of the Data Catalog in which the table resides.

              * **VersionId** *(string) --*

                The ID of the table version.

              * **FederatedTable** *(dict) --*

                A "FederatedTable" structure that references an entity
                outside the Glue Data Catalog.

                * **Identifier** *(string) --*

                  A unique identifier for the federated table.

                * **DatabaseIdentifier** *(string) --*

                  A unique identifier for the federated database.

                * **ConnectionName** *(string) --*

                  The name of the connection to the external
                  metastore.

                * **ConnectionType** *(string) --*

                  The type of connection used to access the federated
                  table, specifying the protocol or method for
                  connecting to the external data source.

              * **ViewDefinition** *(dict) --*

                A structure that contains all the information that
                defines the view, including the dialect or dialects
                for the view, and the query.

                * **IsProtected** *(boolean) --*

                  You can set this flag as true to instruct the engine
                  not to push user-provided operations into the
                  logical plan of the view during query planning.
                  However, setting this flag does not guarantee that
                  the engine will comply. Refer to the engine's
                  documentation to understand the guarantees provided,
                  if any.

                * **Definer** *(string) --*

                  The definer of a view in SQL.

                * **SubObjects** *(list) --*

                  A list of table Amazon Resource Names (ARNs).

                  * *(string) --*

                * **Representations** *(list) --*

                  A list of representations.

                  * *(dict) --*

                    A structure that contains the dialect of the view,
                    and the query that defines the view.

                    * **Dialect** *(string) --*

                      The dialect of the query engine.

                    * **DialectVersion** *(string) --*

                      The version of the dialect of the query engine.
                      For example, 3.0.0.

                    * **ViewOriginalText** *(string) --*

                      The "SELECT" query provided by the customer
                      during "CREATE VIEW DDL". This SQL is not used
                      during a query on a view ( "ViewExpandedText" is
                      used instead). "ViewOriginalText" is used for
                      cases like "SHOW CREATE VIEW" where users want
                      to see the original DDL command that created the
                      view.

                    * **ViewExpandedText** *(string) --*

                      The expanded SQL for the view. This SQL is used
                      by engines while processing a query on a view.
                      Engines may perform operations during view
                      creation to transform "ViewOriginalText" to
                      "ViewExpandedText". For example:

                      * Fully qualified identifiers: "SELECT * from
                        table1 -> SELECT * from db1.table1"

                    * **ValidationConnection** *(string) --*

                      The name of the connection to be used to
                      validate the specific representation of the
                      view.

                    * **IsStale** *(boolean) --*

                      Dialects marked as stale are no longer valid and
                      must be updated before they can be queried in
                      their respective query engines.

              * **IsMultiDialectView** *(boolean) --*

                Specifies whether the view supports the SQL dialects
                of one or more different query engines and can
                therefore be read by those engines.

              * **Status** *(dict) --*

                A structure containing information about the state of
                an asynchronous change to a table.

                * **RequestedBy** *(string) --*

                  The ARN of the user who requested the asynchronous
                  change.

                * **UpdatedBy** *(string) --*

                  The ARN of the user who last manually altered the
                  asynchronous change (for example, by requesting
                  cancellation).

                * **RequestTime** *(datetime) --*

                  An ISO 8601 formatted date string indicating the
                  time that the change was initiated.

                * **UpdateTime** *(datetime) --*

                  An ISO 8601 formatted date string indicating the
                  time that the state was last updated.

                * **Action** *(string) --*

                  Indicates which action was called on the table,
                  currently only "CREATE" or "UPDATE".

                * **State** *(string) --*

                  A generic status for the change in progress, such as
                  QUEUED, IN_PROGRESS, SUCCESS, or FAILED.

                * **Error** *(dict) --*

                  An error that will only appear when the state is
                  "FAILED". This is a parent-level exception message;
                  there may be different "Error" objects for each
                  dialect.

                  * **ErrorCode** *(string) --*

                    The code associated with this error.

                  * **ErrorMessage** *(string) --*

                    A message describing the error.

                * **Details** *(dict) --*

                  A "StatusDetails" object with information about the
                  requested change.

                  * **RequestedChange** *(dict) --*

                    A "Table" object representing the requested
                    changes.

                  * **ViewValidations** *(list) --*

                    A list of "ViewValidation" objects that contain
                    information for an analytical engine to validate a
                    view.

                    * *(dict) --*

                      A structure that contains information for an
                      analytical engine to validate a view, prior to
                      persisting the view metadata. Used in the case
                      of direct "UpdateTable" or "CreateTable" API
                      calls.

                      * **Dialect** *(string) --*

                        The dialect of the query engine.

                      * **DialectVersion** *(string) --*

                        The version of the dialect of the query
                        engine. For example, 3.0.0.

                      * **ViewValidationText** *(string) --*

                        The "SELECT" query that defines the view, as
                        provided by the customer.

                      * **UpdateTime** *(datetime) --*

                        The time of the last update.

                      * **State** *(string) --*

                        The state of the validation.

                      * **Error** *(dict) --*

                        An error associated with the validation.

                        * **ErrorCode** *(string) --*

                          The code associated with this error.

                        * **ErrorMessage** *(string) --*

                          A message describing the error.

            * **VersionId** *(string) --*

              The ID value that identifies this table version. A
              "VersionId" is a string representation of an integer.
              Each version is incremented by 1.

        * **NextToken** *(string) --*

          A continuation token, if the list of available versions does
          not include the last one.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"


get_data_quality_rule_recommendation_run
****************************************

Glue.Client.get_data_quality_rule_recommendation_run(**kwargs)

   Gets the specified recommendation run that was used to generate
   rules.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_data_quality_rule_recommendation_run(
          RunId='string'
      )

   Parameters:
      **RunId** (*string*) --

      **[REQUIRED]**

      The unique run identifier associated with this run.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'RunId': 'string',
             'DataSource': {
                 'GlueTable': {
                     'DatabaseName': 'string',
                     'TableName': 'string',
                     'CatalogId': 'string',
                     'ConnectionName': 'string',
                     'AdditionalOptions': {
                         'string': 'string'
                     }
                 }
             },
             'Role': 'string',
             'NumberOfWorkers': 123,
             'Timeout': 123,
             'Status': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT',
             'ErrorString': 'string',
             'StartedOn': datetime(2015, 1, 1),
             'LastModifiedOn': datetime(2015, 1, 1),
             'CompletedOn': datetime(2015, 1, 1),
             'ExecutionTime': 123,
             'RecommendedRuleset': 'string',
             'CreatedRulesetName': 'string',
             'DataQualitySecurityConfiguration': 'string'
         }

      **Response Structure**

      * *(dict) --*

        The response for the Data Quality rule recommendation run.

        * **RunId** *(string) --*

          The unique run identifier associated with this run.

        * **DataSource** *(dict) --*

          The data source (a Glue table) associated with this run.

          * **GlueTable** *(dict) --*

            A Glue table.

            * **DatabaseName** *(string) --*

              A database name in the Glue Data Catalog.

            * **TableName** *(string) --*

              A table name in the Glue Data Catalog.

            * **CatalogId** *(string) --*

              A unique identifier for the Glue Data Catalog.

            * **ConnectionName** *(string) --*

              The name of the connection to the Glue Data Catalog.

            * **AdditionalOptions** *(dict) --*

              Additional options for the table. Currently there are
              two keys supported:

              * "pushDownPredicate": to filter on partitions without
                having to list and read all the files in your dataset.

              * "catalogPartitionPredicate": to use server-side
                partition pruning using partition indexes in the Glue
                Data Catalog.

              * *(string) --*

                * *(string) --*

        * **Role** *(string) --*

          An IAM role supplied to encrypt the results of the run.

        * **NumberOfWorkers** *(integer) --*

          The number of "G.1X" workers to be used in the run. The
          default is 5.

        * **Timeout** *(integer) --*

          The timeout for a run in minutes. This is the maximum time
          that a run can consume resources before it is terminated and
          enters "TIMEOUT" status. The default is 2,880 minutes (48
          hours).

        * **Status** *(string) --*

          The status for this run.

        * **ErrorString** *(string) --*

          The error strings that are associated with the run.

        * **StartedOn** *(datetime) --*

          The date and time when this run started.

        * **LastModifiedOn** *(datetime) --*

          A timestamp. The last point in time when this data quality
          rule recommendation run was modified.

        * **CompletedOn** *(datetime) --*

          The date and time when this run was completed.

        * **ExecutionTime** *(integer) --*

          The amount of time (in seconds) that the run consumed
          resources.

        * **RecommendedRuleset** *(string) --*

          When a start rule recommendation run completes, it creates a
          recommended ruleset (a set of rules). This member has those
          rules in Data Quality Definition Language (DQDL) format.

        * **CreatedRulesetName** *(string) --*

          The name of the ruleset that was created by the run.

        * **DataQualitySecurityConfiguration** *(string) --*

          The name of the security configuration created with the data
          quality encryption option.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"
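
   A recommendation run is asynchronous, so callers typically poll
   this operation until "Status" stops changing. The following is a
   minimal sketch under stated assumptions: the helper name
   "wait_for_recommendation_run", the polling defaults, and the choice
   of which statuses count as terminal are illustrative and are not
   part of the Glue API.

```python
import time

# Assumption: these values from the Status enum above mean the run will
# not change state again. STARTING, RUNNING and STOPPING are treated as
# still in progress.
TERMINAL_STATUSES = {"STOPPED", "SUCCEEDED", "FAILED", "TIMEOUT"}


def wait_for_recommendation_run(client, run_id, poll_seconds=30,
                                max_polls=120, sleep=time.sleep):
    """Poll get_data_quality_rule_recommendation_run until the run ends.

    `client` is a Glue client such as boto3.client('glue'). `sleep` is
    injectable so the loop can be exercised without real delays.
    """
    for _ in range(max_polls):
        response = client.get_data_quality_rule_recommendation_run(
            RunId=run_id
        )
        if response.get("Status") in TERMINAL_STATUSES:
            return response
        sleep(poll_seconds)
    raise TimeoutError(
        f"run {run_id} still in progress after {max_polls} polls"
    )
```

   With a real client, "wait_for_recommendation_run(boto3.client('glue'),
   run_id)" returns the final response; when "Status" is "SUCCEEDED",
   the DQDL text is in "RecommendedRuleset", and "ErrorString" is the
   place to look otherwise.
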


list_custom_entity_types
************************

Glue.Client.list_custom_entity_types(**kwargs)

   Lists all the custom patterns that have been created.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.list_custom_entity_types(
          NextToken='string',
          MaxResults=123,
          Tags={
              'string': 'string'
          }
      )

   Parameters:
      * **NextToken** (*string*) -- A pagination token to offset the
        results.

      * **MaxResults** (*integer*) -- The maximum number of results to
        return.

      * **Tags** (*dict*) --

        A list of key-value pair tags.

        * *(string) --*

          * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'CustomEntityTypes': [
                 {
                     'Name': 'string',
                     'RegexString': 'string',
                     'ContextWords': [
                         'string',
                     ]
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **CustomEntityTypes** *(list) --*

          A list of "CustomEntityType" objects representing custom
          patterns.

          * *(dict) --*

            An object representing a custom pattern for detecting
            sensitive data across the columns and rows of your
            structured data.

            * **Name** *(string) --*

              A name for the custom pattern that allows it to be
              retrieved or deleted later. This name must be unique per
              Amazon Web Services account.

            * **RegexString** *(string) --*

              A regular expression string that is used for detecting
              sensitive data in a custom pattern.

            * **ContextWords** *(list) --*

              A list of context words. If none of these context words
              are found within the vicinity of the regular expression,
              the data will not be detected as sensitive data.

              If no context words are passed, only a regular
              expression is checked.

              * *(string) --*

        * **NextToken** *(string) --*

          A pagination token, if more results are available.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"
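
   Because results are returned a page at a time, listing every custom
   pattern means passing each response's "NextToken" back into the
   next call. The following is a minimal sketch of that loop; the
   helper name "iter_custom_entity_types" and the default page size
   are illustrative assumptions, not part of the Glue API.

```python
def iter_custom_entity_types(client, page_size=50):
    """Yield every CustomEntityType dict, following NextToken.

    `client` is a Glue client such as boto3.client('glue').
    """
    kwargs = {"MaxResults": page_size}
    while True:
        page = client.list_custom_entity_types(**kwargs)
        yield from page.get("CustomEntityTypes", [])
        token = page.get("NextToken")
        if not token:
            # No token in the response means this was the last page.
            break
        kwargs["NextToken"] = token
```

   For example, "[t['Name'] for t in iter_custom_entity_types(client)]"
   collects the names of all patterns visible to the caller.
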


update_usage_profile
********************

Glue.Client.update_usage_profile(**kwargs)

   Updates a Glue usage profile.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.update_usage_profile(
          Name='string',
          Description='string',
          Configuration={
              'SessionConfiguration': {
                  'string': {
                      'DefaultValue': 'string',
                      'AllowedValues': [
                          'string',
                      ],
                      'MinValue': 'string',
                      'MaxValue': 'string'
                  }
              },
              'JobConfiguration': {
                  'string': {
                      'DefaultValue': 'string',
                      'AllowedValues': [
                          'string',
                      ],
                      'MinValue': 'string',
                      'MaxValue': 'string'
                  }
              }
          }
      )

   Parameters:
      * **Name** (*string*) --

        **[REQUIRED]**

        The name of the usage profile.

      * **Description** (*string*) -- A description of the usage
        profile.

      * **Configuration** (*dict*) --

        **[REQUIRED]**

        A "ProfileConfiguration" object specifying the job and session
        values for the profile.

        * **SessionConfiguration** *(dict) --*

          A key-value map of configuration parameters for Glue
          sessions.

          * *(string) --*

            * *(dict) --*

              Specifies the values that an admin sets for each job or
              session parameter configured in a Glue usage profile.

              * **DefaultValue** *(string) --*

                A default value for the parameter.

              * **AllowedValues** *(list) --*

                A list of allowed values for the parameter.

                * *(string) --*

              * **MinValue** *(string) --*

                A minimum allowed value for the parameter.

              * **MaxValue** *(string) --*

                A maximum allowed value for the parameter.

        * **JobConfiguration** *(dict) --*

          A key-value map of configuration parameters for Glue jobs.

          * *(string) --*

            * *(dict) --*

              Specifies the values that an admin sets for each job or
              session parameter configured in a Glue usage profile.

              * **DefaultValue** *(string) --*

                A default value for the parameter.

              * **AllowedValues** *(list) --*

                A list of allowed values for the parameter.

                * *(string) --*

              * **MinValue** *(string) --*

                A minimum allowed value for the parameter.

              * **MaxValue** *(string) --*

                A maximum allowed value for the parameter.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Name': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Name** *(string) --*

          The name of the usage profile that was updated.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.OperationNotSupportedException"

   * "Glue.Client.exceptions.ConcurrentModificationException"
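
   The nested "Configuration" argument is easier to assemble in two
   steps: build the "ProfileConfiguration" dict, then pass it with the
   required "Name". The following is a minimal sketch; the helper
   names "build_profile_configuration" and "update_profile" are
   illustrative assumptions, and any parameter keys you place inside
   the maps must be ones Glue accepts, which this page does not list.

```python
def build_profile_configuration(job_params=None, session_params=None):
    """Assemble a ProfileConfiguration dict, omitting empty sections.

    Each argument maps a parameter key to a dict that may contain
    DefaultValue, AllowedValues, MinValue and MaxValue. All of these
    values are strings (AllowedValues is a list of strings).
    """
    configuration = {}
    if job_params:
        configuration["JobConfiguration"] = dict(job_params)
    if session_params:
        configuration["SessionConfiguration"] = dict(session_params)
    return configuration


def update_profile(client, name, configuration, description=None):
    """Call update_usage_profile and return the updated profile name."""
    kwargs = {"Name": name, "Configuration": configuration}
    if description is not None:
        kwargs["Description"] = description
    response = client.update_usage_profile(**kwargs)
    return response["Name"]
```

   Note that numeric limits are still passed as strings, for example
   "{'MinValue': '1', 'MaxValue': '10'}", because every member of the
   per-parameter structure is typed as a string.
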


create_integration
******************

Glue.Client.create_integration(**kwargs)

   Creates a Zero-ETL integration in the caller's account between two
   resources with Amazon Resource Names (ARNs): the "SourceArn" and
   "TargetArn".

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_integration(
          IntegrationName='string',
          SourceArn='string',
          TargetArn='string',
          Description='string',
          DataFilter='string',
          KmsKeyId='string',
          AdditionalEncryptionContext={
              'string': 'string'
          },
          Tags=[
              {
                  'key': 'string',
                  'value': 'string'
              },
          ],
          IntegrationConfig={
              'RefreshInterval': 'string',
              'SourceProperties': {
                  'string': 'string'
              }
          }
      )

   Parameters:
      * **IntegrationName** (*string*) --

        **[REQUIRED]**

        A unique name for an integration in Glue.

      * **SourceArn** (*string*) --

        **[REQUIRED]**

        The ARN of the source resource for the integration.

      * **TargetArn** (*string*) --

        **[REQUIRED]**

        The ARN of the target resource for the integration.

      * **Description** (*string*) -- A description of the
        integration.

      * **DataFilter** (*string*) -- Selects source tables for the
        integration using Maxwell filter syntax.

      * **KmsKeyId** (*string*) -- The ARN of a KMS key used for
        encrypting the channel.

      * **AdditionalEncryptionContext** (*dict*) --

        An optional set of non-secret key–value pairs that contains
        additional contextual information for encryption. This can
        only be provided if "KmsKeyId" is provided.

        * *(string) --*

          * *(string) --*

      * **Tags** (*list*) --

        Metadata assigned to the resource consisting of a list of key-
        value pairs.

        * *(dict) --*

          The "Tag" object represents a label that you can assign to
          an Amazon Web Services resource. Each tag consists of a key
          and an optional value, both of which you define.

          For more information about tags, and controlling access to
          resources in Glue, see Amazon Web Services Tags in Glue and
          Specifying Glue Resource ARNs in the developer guide.

          * **key** *(string) --*

            The tag key. The key is required when you create a tag on
            an object. The key is case-sensitive, and must not contain
            the prefix aws.

          * **value** *(string) --*

            The tag value. The value is optional when you create a tag
            on an object. The value is case-sensitive, and must not
            contain the prefix aws.

      * **IntegrationConfig** (*dict*) --

        The configuration settings.

        * **RefreshInterval** *(string) --*

          Specifies the frequency at which CDC (Change Data Capture)
          pulls or incremental loads should occur. This parameter
          provides flexibility to align the refresh rate with your
          specific data update patterns, system load considerations,
          and performance optimization goals. Time increment can be
          set from 15 minutes to 8640 minutes (six days). Currently
          supports creation of "RefreshInterval" only.

        * **SourceProperties** *(dict) --*

          A collection of key-value pairs that specify additional
          properties for the integration source. These properties
          provide configuration options that can be used to customize
          the behavior of the ODB source during data integration
          operations.

          * *(string) --*

            * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'SourceArn': 'string',
             'TargetArn': 'string',
             'IntegrationName': 'string',
             'Description': 'string',
             'IntegrationArn': 'string',
             'KmsKeyId': 'string',
             'AdditionalEncryptionContext': {
                 'string': 'string'
             },
             'Tags': [
                 {
                     'key': 'string',
                     'value': 'string'
                 },
             ],
             'Status': 'CREATING'|'ACTIVE'|'MODIFYING'|'FAILED'|'DELETING'|'SYNCING'|'NEEDS_ATTENTION',
             'CreateTime': datetime(2015, 1, 1),
             'Errors': [
                 {
                     'ErrorCode': 'string',
                     'ErrorMessage': 'string'
                 },
             ],
             'DataFilter': 'string',
             'IntegrationConfig': {
                 'RefreshInterval': 'string',
                 'SourceProperties': {
                     'string': 'string'
                 }
             }
         }

      **Response Structure**

      * *(dict) --*

        * **SourceArn** *(string) --*

          The ARN of the source resource for the integration.

        * **TargetArn** *(string) --*

          The ARN of the target resource for the integration.

        * **IntegrationName** *(string) --*

          A unique name for an integration in Glue.

        * **Description** *(string) --*

          A description of the integration.

        * **IntegrationArn** *(string) --*

          The Amazon Resource Name (ARN) for the created integration.

        * **KmsKeyId** *(string) --*

          The ARN of a KMS key used for encrypting the channel.

        * **AdditionalEncryptionContext** *(dict) --*

          An optional set of non-secret key–value pairs that contains
          additional contextual information for encryption.

          * *(string) --*

            * *(string) --*

        * **Tags** *(list) --*

          Metadata assigned to the resource consisting of a list of
          key-value pairs.

          * *(dict) --*

            The "Tag" object represents a label that you can assign to
            an Amazon Web Services resource. Each tag consists of a
            key and an optional value, both of which you define.

            For more information about tags, and controlling access to
            resources in Glue, see Amazon Web Services Tags in Glue
            and Specifying Glue Resource ARNs in the developer guide.

            * **key** *(string) --*

              The tag key. The key is required when you create a tag
              on an object. The key is case-sensitive, and must not
              contain the prefix aws.

            * **value** *(string) --*

              The tag value. The value is optional when you create a
              tag on an object. The value is case-sensitive, and must
              not contain the prefix aws.

        * **Status** *(string) --*

          The status of the integration being created.

          The possible statuses are:

          * CREATING: The integration is being created.

          * ACTIVE: The integration creation succeeds.

          * MODIFYING: The integration is being modified.

          * FAILED: The integration creation fails.

          * DELETING: The integration is being deleted.

          * SYNCING: The integration is synchronizing.

          * NEEDS_ATTENTION: The integration needs attention, such as
            synchronization.

        * **CreateTime** *(datetime) --*

          The time when the integration was created, in UTC.

        * **Errors** *(list) --*

          A list of errors associated with the integration creation.

          * *(dict) --*

            An error associated with a zero-ETL integration.

            * **ErrorCode** *(string) --*

              The code associated with this error.

            * **ErrorMessage** *(string) --*

              A message describing the error.

        * **DataFilter** *(string) --*

          Selects source tables for the integration using Maxwell
          filter syntax.

        * **IntegrationConfig** *(dict) --*

          The configuration settings.

          * **RefreshInterval** *(string) --*

            Specifies the frequency at which CDC (Change Data Capture)
            pulls or incremental loads should occur. This parameter
            provides flexibility to align the refresh rate with your
            specific data update patterns, system load considerations,
            and performance optimization goals. Time increment can be
            set from 15 minutes to 8640 minutes (six days). Currently
            supports creation of "RefreshInterval" only.

          * **SourceProperties** *(dict) --*

            A collection of key-value pairs that specify additional
            properties for the integration source. These properties
            provide configuration options that can be used to
            customize the behavior of the ODB source during data
            integration operations.

            * *(string) --*

              * *(string) --*

   **Exceptions**

   * "Glue.Client.exceptions.ValidationException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.ResourceNotFoundException"

   * "Glue.Client.exceptions.InternalServerException"

   * "Glue.Client.exceptions.IntegrationConflictOperationFault"

   * "Glue.Client.exceptions.IntegrationQuotaExceededFault"

   * "Glue.Client.exceptions.KMSKeyNotAccessibleFault"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.ConflictException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.InvalidInputException"
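
   The following is a minimal sketch of a wrapper that passes only the
   optional arguments the caller supplied and enforces the documented
   rule that "AdditionalEncryptionContext" requires "KmsKeyId". The
   helper name "create_zero_etl_integration" and its keyword arguments
   are illustrative assumptions, not part of the Glue API.

```python
def create_zero_etl_integration(client, name, source_arn, target_arn,
                                kms_key_id=None, encryption_context=None,
                                tags=None):
    """Call create_integration and return (IntegrationArn, Status).

    `client` is a Glue client such as boto3.client('glue'). `tags` is a
    plain dict that is converted to the list-of-dicts form shown in the
    request syntax.
    """
    if encryption_context and not kms_key_id:
        # Checked locally so the mistake surfaces before any API call.
        raise ValueError("AdditionalEncryptionContext requires KmsKeyId")

    kwargs = {
        "IntegrationName": name,
        "SourceArn": source_arn,
        "TargetArn": target_arn,
    }
    if kms_key_id:
        kwargs["KmsKeyId"] = kms_key_id
    if encryption_context:
        kwargs["AdditionalEncryptionContext"] = dict(encryption_context)
    if tags:
        # Tag members use lowercase 'key' and 'value' in this operation.
        kwargs["Tags"] = [{"key": k, "value": v} for k, v in tags.items()]

    response = client.create_integration(**kwargs)
    return response["IntegrationArn"], response["Status"]
```

   The call returns while the integration is still being set up, so
   the returned status is normally "CREATING"; check the "Errors" list
   in the response if it is "FAILED".
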


update_column_statistics_for_partition
**************************************

Glue.Client.update_column_statistics_for_partition(**kwargs)

   Creates or updates partition statistics of columns.

   The Identity and Access Management (IAM) permission required for
   this operation is "UpdatePartition".

   See also: AWS API Documentation

   **Request Syntax**

      response = client.update_column_statistics_for_partition(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          PartitionValues=[
              'string',
          ],
          ColumnStatisticsList=[
              {
                  'ColumnName': 'string',
                  'ColumnType': 'string',
                  'AnalyzedTime': datetime(2015, 1, 1),
                  'StatisticsData': {
                      'Type': 'BOOLEAN'|'DATE'|'DECIMAL'|'DOUBLE'|'LONG'|'STRING'|'BINARY',
                      'BooleanColumnStatisticsData': {
                          'NumberOfTrues': 123,
                          'NumberOfFalses': 123,
                          'NumberOfNulls': 123
                      },
                      'DateColumnStatisticsData': {
                          'MinimumValue': datetime(2015, 1, 1),
                          'MaximumValue': datetime(2015, 1, 1),
                          'NumberOfNulls': 123,
                          'NumberOfDistinctValues': 123
                      },
                      'DecimalColumnStatisticsData': {
                          'MinimumValue': {
                              'UnscaledValue': b'bytes',
                              'Scale': 123
                          },
                          'MaximumValue': {
                              'UnscaledValue': b'bytes',
                              'Scale': 123
                          },
                          'NumberOfNulls': 123,
                          'NumberOfDistinctValues': 123
                      },
                      'DoubleColumnStatisticsData': {
                          'MinimumValue': 123.0,
                          'MaximumValue': 123.0,
                          'NumberOfNulls': 123,
                          'NumberOfDistinctValues': 123
                      },
                      'LongColumnStatisticsData': {
                          'MinimumValue': 123,
                          'MaximumValue': 123,
                          'NumberOfNulls': 123,
                          'NumberOfDistinctValues': 123
                      },
                      'StringColumnStatisticsData': {
                          'MaximumLength': 123,
                          'AverageLength': 123.0,
                          'NumberOfNulls': 123,
                          'NumberOfDistinctValues': 123
                      },
                      'BinaryColumnStatisticsData': {
                          'MaximumLength': 123,
                          'AverageLength': 123.0,
                          'NumberOfNulls': 123
                      }
                  }
              },
          ]
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog where
        the partitions in question reside. If none is supplied, the
        Amazon Web Services account ID is used by default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the catalog database where the partitions reside.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the partitions' table.

      * **PartitionValues** (*list*) --

        **[REQUIRED]**

        A list of partition values identifying the partition.

        * *(string) --*

      * **ColumnStatisticsList** (*list*) --

        **[REQUIRED]**

        A list of the column statistics.

        * *(dict) --*

          Represents the generated column-level statistics for a table
          or partition.

          * **ColumnName** *(string) --* **[REQUIRED]**

            The name of the column to which the statistics belong.

          * **ColumnType** *(string) --* **[REQUIRED]**

            The data type of the column.

          * **AnalyzedTime** *(datetime) --* **[REQUIRED]**

            The timestamp of when column statistics were generated.

          * **StatisticsData** *(dict) --* **[REQUIRED]**

            A "ColumnStatisticData" object that contains the
            statistics data values.

            * **Type** *(string) --* **[REQUIRED]**

              The type of column statistics data.

            * **BooleanColumnStatisticsData** *(dict) --*

              Boolean column statistics data.

              * **NumberOfTrues** *(integer) --* **[REQUIRED]**

                The number of true values in the column.

              * **NumberOfFalses** *(integer) --* **[REQUIRED]**

                The number of false values in the column.

              * **NumberOfNulls** *(integer) --* **[REQUIRED]**

                The number of null values in the column.

            * **DateColumnStatisticsData** *(dict) --*

              Date column statistics data.

              * **MinimumValue** *(datetime) --*

                The lowest value in the column.

              * **MaximumValue** *(datetime) --*

                The highest value in the column.

              * **NumberOfNulls** *(integer) --* **[REQUIRED]**

                The number of null values in the column.

              * **NumberOfDistinctValues** *(integer) --*
                **[REQUIRED]**

                The number of distinct values in a column.

            * **DecimalColumnStatisticsData** *(dict) --*

              Decimal column statistics data. UnscaledValues within
              are Base64-encoded binary objects storing big-endian,
              two's complement representations of the decimal's
              unscaled value.

              * **MinimumValue** *(dict) --*

                The lowest value in the column.

                * **UnscaledValue** *(bytes) --* **[REQUIRED]**

                  The unscaled numeric value.

                * **Scale** *(integer) --* **[REQUIRED]**

                  The scale that determines where the decimal point
                  falls in the unscaled value.

              * **MaximumValue** *(dict) --*

                The highest value in the column.

                * **UnscaledValue** *(bytes) --* **[REQUIRED]**

                  The unscaled numeric value.

                * **Scale** *(integer) --* **[REQUIRED]**

                  The scale that determines where the decimal point
                  falls in the unscaled value.

              * **NumberOfNulls** *(integer) --* **[REQUIRED]**

                The number of null values in the column.

              * **NumberOfDistinctValues** *(integer) --*
                **[REQUIRED]**

                The number of distinct values in a column.

            * **DoubleColumnStatisticsData** *(dict) --*

              Double column statistics data.

              * **MinimumValue** *(float) --*

                The lowest value in the column.

              * **MaximumValue** *(float) --*

                The highest value in the column.

              * **NumberOfNulls** *(integer) --* **[REQUIRED]**

                The number of null values in the column.

              * **NumberOfDistinctValues** *(integer) --*
                **[REQUIRED]**

                The number of distinct values in a column.

            * **LongColumnStatisticsData** *(dict) --*

              Long column statistics data.

              * **MinimumValue** *(integer) --*

                The lowest value in the column.

              * **MaximumValue** *(integer) --*

                The highest value in the column.

              * **NumberOfNulls** *(integer) --* **[REQUIRED]**

                The number of null values in the column.

              * **NumberOfDistinctValues** *(integer) --*
                **[REQUIRED]**

                The number of distinct values in a column.

            * **StringColumnStatisticsData** *(dict) --*

              String column statistics data.

              * **MaximumLength** *(integer) --* **[REQUIRED]**

                The size of the longest string in the column.

              * **AverageLength** *(float) --* **[REQUIRED]**

                The average string length in the column.

              * **NumberOfNulls** *(integer) --* **[REQUIRED]**

                The number of null values in the column.

              * **NumberOfDistinctValues** *(integer) --*
                **[REQUIRED]**

                The number of distinct values in a column.

            * **BinaryColumnStatisticsData** *(dict) --*

              Binary column statistics data.

              * **MaximumLength** *(integer) --* **[REQUIRED]**

                The size of the longest bit sequence in the column.

              * **AverageLength** *(float) --* **[REQUIRED]**

                The average bit sequence length in the column.

              * **NumberOfNulls** *(integer) --* **[REQUIRED]**

                The number of null values in the column.
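
   The "UnscaledValue" and "Scale" pair described under
   "DecimalColumnStatisticsData" can be derived from a Python
   "Decimal". The following is a minimal sketch, not part of the
   Glue API: the helper names "to_decimal_number" and
   "from_decimal_number" are hypothetical, and it assumes that raw
   bytes are passed to the client and that the SDK applies the Base64
   step when it serializes the request. Folding a positive exponent
   into the unscaled value so that "Scale" is never negative is a
   choice of this sketch, not a documented requirement.

```python
from decimal import Decimal


def to_decimal_number(value: Decimal) -> dict:
    """Convert a finite Decimal into an {'UnscaledValue', 'Scale'} dict.

    UnscaledValue is the big-endian, two's complement encoding of the
    integer obtained by dropping the decimal point.
    """
    if not value.is_finite():
        raise ValueError("NaN and Infinity cannot be represented")

    sign, digits, exponent = value.as_tuple()
    unscaled = int("".join(str(d) for d in digits))
    if sign:
        unscaled = -unscaled

    scale = -exponent
    if scale < 0:
        # Assumption of this sketch: fold a positive exponent (e.g. 1E+2)
        # into the unscaled value so the scale is never negative.
        unscaled *= 10 ** (-scale)
        scale = 0

    # One extra bit is reserved for the sign, so 128 needs two bytes.
    length = (unscaled.bit_length() + 8) // 8
    return {
        "UnscaledValue": unscaled.to_bytes(length, "big", signed=True),
        "Scale": scale,
    }


def from_decimal_number(number: dict) -> Decimal:
    """Inverse of to_decimal_number, built exactly from sign and digits."""
    unscaled = int.from_bytes(number["UnscaledValue"], "big", signed=True)
    sign = 1 if unscaled < 0 else 0
    digits = tuple(int(d) for d in str(abs(unscaled)))
    return Decimal((sign, digits, -number["Scale"]))
```

   The resulting dict could be used as the "MinimumValue" or
   "MaximumValue" of a "DecimalColumnStatisticsData" entry, for
   example "to_decimal_number(Decimal('123.45'))".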

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Errors': [
                 {
                     'ColumnStatistics': {
                         'ColumnName': 'string',
                         'ColumnType': 'string',
                         'AnalyzedTime': datetime(2015, 1, 1),
                         'StatisticsData': {
                             'Type': 'BOOLEAN'|'DATE'|'DECIMAL'|'DOUBLE'|'LONG'|'STRING'|'BINARY',
                             'BooleanColumnStatisticsData': {
                                 'NumberOfTrues': 123,
                                 'NumberOfFalses': 123,
                                 'NumberOfNulls': 123
                             },
                             'DateColumnStatisticsData': {
                                 'MinimumValue': datetime(2015, 1, 1),
                                 'MaximumValue': datetime(2015, 1, 1),
                                 'NumberOfNulls': 123,
                                 'NumberOfDistinctValues': 123
                             },
                             'DecimalColumnStatisticsData': {
                                 'MinimumValue': {
                                     'UnscaledValue': b'bytes',
                                     'Scale': 123
                                 },
                                 'MaximumValue': {
                                     'UnscaledValue': b'bytes',
                                     'Scale': 123
                                 },
                                 'NumberOfNulls': 123,
                                 'NumberOfDistinctValues': 123
                             },
                             'DoubleColumnStatisticsData': {
                                 'MinimumValue': 123.0,
                                 'MaximumValue': 123.0,
                                 'NumberOfNulls': 123,
                                 'NumberOfDistinctValues': 123
                             },
                             'LongColumnStatisticsData': {
                                 'MinimumValue': 123,
                                 'MaximumValue': 123,
                                 'NumberOfNulls': 123,
                                 'NumberOfDistinctValues': 123
                             },
                             'StringColumnStatisticsData': {
                                 'MaximumLength': 123,
                                 'AverageLength': 123.0,
                                 'NumberOfNulls': 123,
                                 'NumberOfDistinctValues': 123
                             },
                             'BinaryColumnStatisticsData': {
                                 'MaximumLength': 123,
                                 'AverageLength': 123.0,
                                 'NumberOfNulls': 123
                             }
                         }
                     },
                     'Error': {
                         'ErrorCode': 'string',
                         'ErrorMessage': 'string'
                     }
                 },
             ]
         }

      **Response Structure**

      * *(dict) --*

        * **Errors** *(list) --*

          Errors that occurred while updating column statistics data.

          * *(dict) --*

            Encapsulates a "ColumnStatistics" object that failed and
            the reason for failure.

            * **ColumnStatistics** *(dict) --*

              The "ColumnStatistics" of the column.

              * **ColumnName** *(string) --*

                The name of the column to which the statistics belong.

              * **ColumnType** *(string) --*

                The data type of the column.

              * **AnalyzedTime** *(datetime) --*

                The timestamp of when column statistics were
                generated.

              * **StatisticsData** *(dict) --*

                A "ColumnStatisticData" object that contains the
                statistics data values.

                * **Type** *(string) --*

                  The type of column statistics data.

                * **BooleanColumnStatisticsData** *(dict) --*

                  Boolean column statistics data.

                  * **NumberOfTrues** *(integer) --*

                    The number of true values in the column.

                  * **NumberOfFalses** *(integer) --*

                    The number of false values in the column.

                  * **NumberOfNulls** *(integer) --*

                    The number of null values in the column.

                * **DateColumnStatisticsData** *(dict) --*

                  Date column statistics data.

                  * **MinimumValue** *(datetime) --*

                    The lowest value in the column.

                  * **MaximumValue** *(datetime) --*

                    The highest value in the column.

                  * **NumberOfNulls** *(integer) --*

                    The number of null values in the column.

                  * **NumberOfDistinctValues** *(integer) --*

                    The number of distinct values in a column.

                * **DecimalColumnStatisticsData** *(dict) --*

                  Decimal column statistics data. UnscaledValues
                  within are Base64-encoded binary objects storing
                  big-endian, two's complement representations of the
                  decimal's unscaled value.

                  * **MinimumValue** *(dict) --*

                    The lowest value in the column.

                    * **UnscaledValue** *(bytes) --*

                      The unscaled numeric value.

                    * **Scale** *(integer) --*

                      The scale that determines where the decimal
                      point falls in the unscaled value.

                  * **MaximumValue** *(dict) --*

                    The highest value in the column.

                    * **UnscaledValue** *(bytes) --*

                      The unscaled numeric value.

                    * **Scale** *(integer) --*

                      The scale that determines where the decimal
                      point falls in the unscaled value.

                  * **NumberOfNulls** *(integer) --*

                    The number of null values in the column.

                  * **NumberOfDistinctValues** *(integer) --*

                    The number of distinct values in a column.

                * **DoubleColumnStatisticsData** *(dict) --*

                  Double column statistics data.

                  * **MinimumValue** *(float) --*

                    The lowest value in the column.

                  * **MaximumValue** *(float) --*

                    The highest value in the column.

                  * **NumberOfNulls** *(integer) --*

                    The number of null values in the column.

                  * **NumberOfDistinctValues** *(integer) --*

                    The number of distinct values in a column.

                * **LongColumnStatisticsData** *(dict) --*

                  Long column statistics data.

                  * **MinimumValue** *(integer) --*

                    The lowest value in the column.

                  * **MaximumValue** *(integer) --*

                    The highest value in the column.

                  * **NumberOfNulls** *(integer) --*

                    The number of null values in the column.

                  * **NumberOfDistinctValues** *(integer) --*

                    The number of distinct values in a column.

                * **StringColumnStatisticsData** *(dict) --*

                  String column statistics data.

                  * **MaximumLength** *(integer) --*

                    The size of the longest string in the column.

                  * **AverageLength** *(float) --*

                    The average string length in the column.

                  * **NumberOfNulls** *(integer) --*

                    The number of null values in the column.

                  * **NumberOfDistinctValues** *(integer) --*

                    The number of distinct values in a column.

                * **BinaryColumnStatisticsData** *(dict) --*

                  Binary column statistics data.

                  * **MaximumLength** *(integer) --*

                    The size of the longest bit sequence in the
                    column.

                  * **AverageLength** *(float) --*

                    The average bit sequence length in the column.

                  * **NumberOfNulls** *(integer) --*

                    The number of null values in the column.

            * **Error** *(dict) --*

              An error message with the reason for the failure of an
              operation.

              * **ErrorCode** *(string) --*

                The code associated with this error.

              * **ErrorMessage** *(string) --*

                A message describing the error.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"
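
   Because failures for individual columns are reported in the
   "Errors" list of the response rather than raised, a caller will
   usually want to inspect that list. The following is a minimal
   sketch under stated assumptions: the helper names
   "build_long_stats" and "update_partition_stats" are hypothetical,
   the column type "bigint" is only an illustrative value, and
   "client" is expected to be a Glue client such as the one returned
   by "boto3.client('glue')".

```python
from datetime import datetime, timezone


def build_long_stats(column_name, minimum, maximum, nulls, distinct):
    """Build one ColumnStatisticsList entry of type LONG."""
    return {
        "ColumnName": column_name,
        "ColumnType": "bigint",  # assumption: illustrative column type
        "AnalyzedTime": datetime.now(timezone.utc),
        "StatisticsData": {
            "Type": "LONG",
            "LongColumnStatisticsData": {
                "MinimumValue": minimum,
                "MaximumValue": maximum,
                "NumberOfNulls": nulls,
                "NumberOfDistinctValues": distinct,
            },
        },
    }


def update_partition_stats(client, database, table, partition_values, stats):
    """Send the statistics and return (column, code, message) per failure."""
    response = client.update_column_statistics_for_partition(
        DatabaseName=database,
        TableName=table,
        PartitionValues=partition_values,
        ColumnStatisticsList=stats,
    )
    return [
        (
            entry["ColumnStatistics"]["ColumnName"],
            entry["Error"]["ErrorCode"],
            entry["Error"]["ErrorMessage"],
        )
        for entry in response.get("Errors", [])
    ]
```

   An empty return value from "update_partition_stats" would indicate
   that no column was reported as failed.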


delete_user_defined_function
****************************

Glue.Client.delete_user_defined_function(**kwargs)

   Deletes an existing function definition from the Data Catalog.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_user_defined_function(
          CatalogId='string',
          DatabaseName='string',
          FunctionName='string'
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog where
        the function to be deleted is located. If none is supplied,
        the Amazon Web Services account ID is used by default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the catalog database where the function is
        located.

      * **FunctionName** (*string*) --

        **[REQUIRED]**

        The name of the function definition to be deleted.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
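
   A caller that wants deletion to succeed even when the function is
   already gone can catch "EntityNotFoundException". The following is
   a minimal sketch: the helper name "delete_function_if_exists" is
   hypothetical, and "client" is expected to be a Glue client such as
   the one returned by "boto3.client('glue')".

```python
def delete_function_if_exists(client, database, function_name, catalog_id=None):
    """Delete a user-defined function.

    Returns True if the function was deleted and False if it did not exist.
    """
    kwargs = {"DatabaseName": database, "FunctionName": function_name}
    if catalog_id is not None:
        # CatalogId is optional; when omitted, the account ID is used.
        kwargs["CatalogId"] = catalog_id
    try:
        client.delete_user_defined_function(**kwargs)
    except client.exceptions.EntityNotFoundException:
        return False
    return True
```

   The other listed exceptions are deliberately left to propagate so
   that invalid input or service failures are not silently ignored.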


start_data_quality_rule_recommendation_run
******************************************

Glue.Client.start_data_quality_rule_recommendation_run(**kwargs)

   Starts a recommendation run that is used to generate rules when you
   don't know what rules to write. Glue Data Quality analyzes the data
   and comes up with recommendations for a potential ruleset. You can
   then triage the ruleset and modify the generated ruleset to your
   liking.

   Recommendation runs are automatically deleted after 90 days.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.start_data_quality_rule_recommendation_run(
          DataSource={
              'GlueTable': {
                  'DatabaseName': 'string',
                  'TableName': 'string',
                  'CatalogId': 'string',
                  'ConnectionName': 'string',
                  'AdditionalOptions': {
                      'string': 'string'
                  }
              }
          },
          Role='string',
          NumberOfWorkers=123,
          Timeout=123,
          CreatedRulesetName='string',
          DataQualitySecurityConfiguration='string',
          ClientToken='string'
      )

   Parameters:
      * **DataSource** (*dict*) --

        **[REQUIRED]**

        The data source (Glue table) associated with this run.

        * **GlueTable** *(dict) --* **[REQUIRED]**

          A Glue table.

          * **DatabaseName** *(string) --* **[REQUIRED]**

            A database name in the Glue Data Catalog.

          * **TableName** *(string) --* **[REQUIRED]**

            A table name in the Glue Data Catalog.

          * **CatalogId** *(string) --*

            A unique identifier for the Glue Data Catalog.

          * **ConnectionName** *(string) --*

            The name of the connection to the Glue Data Catalog.

          * **AdditionalOptions** *(dict) --*

            Additional options for the table. Currently there are two
            keys supported:

            * "pushDownPredicate": to filter on partitions without
              having to list and read all the files in your dataset.

            * "catalogPartitionPredicate": to use server-side
              partition pruning using partition indexes in the Glue
              Data Catalog.

            * *(string) --*

              * *(string) --*

      * **Role** (*string*) --

        **[REQUIRED]**

        An IAM role supplied to encrypt the results of the run.

      * **NumberOfWorkers** (*integer*) -- The number of "G.1X"
        workers to be used in the run. The default is 5.

      * **Timeout** (*integer*) -- The timeout for a run in minutes.
        This is the maximum time that a run can consume resources
        before it is terminated and enters "TIMEOUT" status. The
        default is 2,880 minutes (48 hours).

      * **CreatedRulesetName** (*string*) -- A name for the ruleset.

      * **DataQualitySecurityConfiguration** (*string*) -- The name of
        the security configuration created with the data quality
        encryption option.

      * **ClientToken** (*string*) -- Used for idempotency and is
        recommended to be set to a random ID (such as a UUID) to avoid
        creating or starting multiple instances of the same resource.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'RunId': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **RunId** *(string) --*

          The unique run identifier associated with this run.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.ConflictException"
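
   The following is a minimal sketch of starting a run with a
   "ClientToken" and an optional "pushDownPredicate". The helper name
   "start_recommendation_run" is hypothetical, and "client" is
   expected to be a Glue client such as the one returned by
   "boto3.client('glue')". The token is returned alongside the run ID
   so that a retry of the same logical request can reuse it; that
   retry policy is an assumption of this sketch.

```python
import uuid


def start_recommendation_run(client, database, table, role,
                             predicate=None, ruleset_name=None,
                             client_token=None):
    """Start a rule recommendation run and return (run_id, client_token)."""
    glue_table = {"DatabaseName": database, "TableName": table}
    if predicate is not None:
        # Restrict the analyzed data to matching partitions.
        glue_table["AdditionalOptions"] = {"pushDownPredicate": predicate}

    # Reuse the caller's token on retries; otherwise generate a random one.
    token = client_token or str(uuid.uuid4())

    kwargs = {
        "DataSource": {"GlueTable": glue_table},
        "Role": role,
        "ClientToken": token,
    }
    if ruleset_name is not None:
        kwargs["CreatedRulesetName"] = ruleset_name

    response = client.start_data_quality_rule_recommendation_run(**kwargs)
    return response["RunId"], token
```

   "NumberOfWorkers" and "Timeout" are omitted here, so the documented
   defaults apply.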


get_catalogs
************

Glue.Client.get_catalogs(**kwargs)

   Retrieves all catalogs defined in a catalog in the Glue Data
   Catalog. For a Redshift-federated catalog use case, this operation
   returns the list of catalogs mapped to Redshift databases in the
   Redshift namespace catalog.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_catalogs(
          ParentCatalogId='string',
          NextToken='string',
          MaxResults=123,
          Recursive=True|False,
          IncludeRoot=True|False
      )

   Parameters:
      * **ParentCatalogId** (*string*) -- The ID of the parent catalog
        in which the catalog resides. If none is provided, the Amazon
        Web Services Account Number is used by default.

      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation call.

      * **MaxResults** (*integer*) -- The maximum number of catalogs
        to return in one response.

      * **Recursive** (*boolean*) -- Whether to list all catalogs
        across the catalog hierarchy, starting from the
        "ParentCatalogId". Defaults to "false". When "true", all
        catalog objects in the "ParentCatalogId" hierarchy are
        enumerated in the response.

      * **IncludeRoot** (*boolean*) --

        Whether to list the default catalog in the account and region
        in the response. Defaults to "false". When "true" and
        "ParentCatalogId = NULL | Amazon Web Services Account ID", all
        catalogs and the default catalog are enumerated in the
        response.

        When the "ParentCatalogId" is not equal to null, and this
        attribute is passed as "false" or "true", an
        "InvalidInputException" is thrown.
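
   Results may span several responses, linked by "NextToken". The
   following is a minimal sketch of walking every page manually. The
   helper name "iter_catalogs" is hypothetical, and "client" is
   expected to be a Glue client such as the one returned by
   "boto3.client('glue')". "IncludeRoot" is left out because of the
   restriction described above, and "MaxResults" is only sent when
   the caller supplies a page size.

```python
def iter_catalogs(client, parent_catalog_id=None, recursive=False, page_size=None):
    """Yield every catalog dict, following NextToken until it is absent."""
    kwargs = {"Recursive": recursive}
    if parent_catalog_id is not None:
        kwargs["ParentCatalogId"] = parent_catalog_id
    if page_size is not None:
        kwargs["MaxResults"] = page_size

    while True:
        response = client.get_catalogs(**kwargs)
        yield from response.get("CatalogList", [])

        token = response.get("NextToken")
        if not token:
            break
        # Pass the continuation token back to fetch the next page.
        kwargs["NextToken"] = token
```

   For example, "[c['Name'] for c in iter_catalogs(client,
   recursive=True)]" would collect the names of all catalogs in the
   hierarchy.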

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'CatalogList': [
                 {
                     'CatalogId': 'string',
                     'Name': 'string',
                     'ResourceArn': 'string',
                     'Description': 'string',
                     'Parameters': {
                         'string': 'string'
                     },
                     'CreateTime': datetime(2015, 1, 1),
                     'UpdateTime': datetime(2015, 1, 1),
                     'TargetRedshiftCatalog': {
                         'CatalogArn': 'string'
                     },
                     'FederatedCatalog': {
                         'Identifier': 'string',
                         'ConnectionName': 'string',
                         'ConnectionType': 'string'
                     },
                     'CatalogProperties': {
                         'DataLakeAccessProperties': {
                             'DataLakeAccess': True|False,
                             'DataTransferRole': 'string',
                             'KmsKey': 'string',
                             'ManagedWorkgroupName': 'string',
                             'ManagedWorkgroupStatus': 'string',
                             'RedshiftDatabaseName': 'string',
                             'StatusMessage': 'string',
                             'CatalogType': 'string'
                         },
                         'IcebergOptimizationProperties': {
                             'RoleArn': 'string',
                             'Compaction': {
                                 'string': 'string'
                             },
                             'Retention': {
                                 'string': 'string'
                             },
                             'OrphanFileDeletion': {
                                 'string': 'string'
                             },
                             'LastUpdatedTime': datetime(2015, 1, 1)
                         },
                         'CustomProperties': {
                             'string': 'string'
                         }
                     },
                     'CreateTableDefaultPermissions': [
                         {
                             'Principal': {
                                 'DataLakePrincipalIdentifier': 'string'
                             },
                             'Permissions': [
                                 'ALL'|'SELECT'|'ALTER'|'DROP'|'DELETE'|'INSERT'|'CREATE_DATABASE'|'CREATE_TABLE'|'DATA_LOCATION_ACCESS',
                             ]
                         },
                     ],
                     'CreateDatabaseDefaultPermissions': [
                         {
                             'Principal': {
                                 'DataLakePrincipalIdentifier': 'string'
                             },
                             'Permissions': [
                                 'ALL'|'SELECT'|'ALTER'|'DROP'|'DELETE'|'INSERT'|'CREATE_DATABASE'|'CREATE_TABLE'|'DATA_LOCATION_ACCESS',
                             ]
                         },
                     ],
                     'AllowFullTableExternalDataAccess': 'True'|'False'
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **CatalogList** *(list) --*

          A list of "Catalog" objects from the specified parent
          catalog.

          * *(dict) --*

            The catalog object represents a logical grouping of
            databases in the Glue Data Catalog or a federated source.
            You can now create a Redshift-federated catalog or a
            catalog containing resource links to Redshift databases in
            another account or region.

            * **CatalogId** *(string) --*

              The ID of the catalog. To grant access to the default
              catalog, this field should not be provided.

            * **Name** *(string) --*

              The name of the catalog. Cannot be the same as the
              account ID.

            * **ResourceArn** *(string) --*

              The Amazon Resource Name (ARN) assigned to the catalog
              resource.

            * **Description** *(string) --*

              A description of the catalog. The description string is
              not more than 2048 bytes long and matches the URI
              address multi-line string pattern.

            * **Parameters** *(dict) --*

              A map of key-value pairs that define parameters and
              properties of the catalog.

              * *(string) --*

                * *(string) --*

            * **CreateTime** *(datetime) --*

              The time at which the catalog was created.

            * **UpdateTime** *(datetime) --*

              The time at which the catalog was last updated.

            * **TargetRedshiftCatalog** *(dict) --*

              A "TargetRedshiftCatalog" object that describes a target
              catalog for database resource linking.

              * **CatalogArn** *(string) --*

                The Amazon Resource Name (ARN) of the catalog
                resource.

            * **FederatedCatalog** *(dict) --*

              A "FederatedCatalog" object that points to an entity
              outside the Glue Data Catalog.

              * **Identifier** *(string) --*

                A unique identifier for the federated catalog.

              * **ConnectionName** *(string) --*

                The name of the connection to an external data source,
                for example a Redshift-federated catalog.

              * **ConnectionType** *(string) --*

                The type of connection used to access the federated
                catalog, specifying the protocol or method for
                connection to the external data source.

            * **CatalogProperties** *(dict) --*

              A "CatalogProperties" object that specifies data lake
              access properties and other custom properties.

              * **DataLakeAccessProperties** *(dict) --*

                A "DataLakeAccessProperties" object with input
                properties to configure data lake access for your
                catalog resource in the Glue Data Catalog.

                * **DataLakeAccess** *(boolean) --*

                  Turns on or off data lake access for Apache Spark
                  applications that access Amazon Redshift databases
                  in the Data Catalog.

                * **DataTransferRole** *(string) --*

                  A role that will be assumed by Glue for transferring
                  data into/out of the staging bucket during a query.

                * **KmsKey** *(string) --*

                  An encryption key that will be used for the staging
                  bucket that will be created along with the catalog.

                * **ManagedWorkgroupName** *(string) --*

                  The managed Redshift Serverless compute name that is
                  created for your catalog resource.

                * **ManagedWorkgroupStatus** *(string) --*

                  The managed Redshift Serverless compute status.

                * **RedshiftDatabaseName** *(string) --*

                  The default Redshift database resource name in the
                  managed compute.

                * **StatusMessage** *(string) --*

                  A message that gives more detailed information about
                  the managed workgroup status.

                * **CatalogType** *(string) --*

                  Specifies a federated catalog type for the native
                  catalog resource. The currently supported type is
                  "aws:redshift".

              * **IcebergOptimizationProperties** *(dict) --*

                An "IcebergOptimizationPropertiesOutput" object that
                specifies Iceberg table optimization settings for the
                catalog, including configurations for compaction,
                retention, and orphan file deletion operations.

                * **RoleArn** *(string) --*

                  The Amazon Resource Name (ARN) of the IAM role that
                  is used to perform Iceberg table optimization
                  operations.

                * **Compaction** *(dict) --*

                  A map of key-value pairs that specify configuration
                  parameters for Iceberg table compaction operations,
                  which optimize the layout of data files to improve
                  query performance.

                  * *(string) --*

                    * *(string) --*

                * **Retention** *(dict) --*

                  A map of key-value pairs that specify configuration
                  parameters for Iceberg table retention operations,
                  which manage the lifecycle of table snapshots to
                  control storage costs.

                  * *(string) --*

                    * *(string) --*

                * **OrphanFileDeletion** *(dict) --*

                  A map of key-value pairs that specify configuration
                  parameters for Iceberg orphan file deletion
                  operations, which identify and remove files that are
                  no longer referenced by the table metadata.

                  * *(string) --*

                    * *(string) --*

                * **LastUpdatedTime** *(datetime) --*

                  The timestamp when the Iceberg optimization
                  properties were last updated.

              * **CustomProperties** *(dict) --*

                Additional key-value properties for the catalog, such
                as column statistics optimizations.

                * *(string) --*

                  * *(string) --*

            * **CreateTableDefaultPermissions** *(list) --*

              An array of "PrincipalPermissions" objects. Creates a
              set of default permissions on the table(s) for
              principals. Used by Amazon Web Services Lake Formation.
              Not used in the normal course of Glue operations.

              * *(dict) --*

                Permissions granted to a principal.

                * **Principal** *(dict) --*

                  The principal who is granted permissions.

                  * **DataLakePrincipalIdentifier** *(string) --*

                    An identifier for the Lake Formation principal.

                * **Permissions** *(list) --*

                  The permissions that are granted to the principal.

                  * *(string) --*

            * **CreateDatabaseDefaultPermissions** *(list) --*

              An array of "PrincipalPermissions" objects. Creates a
              set of default permissions on the database(s) for
              principals. Used by Amazon Web Services Lake Formation.
              Not used in the normal course of Glue operations.

              * *(dict) --*

                Permissions granted to a principal.

                * **Principal** *(dict) --*

                  The principal who is granted permissions.

                  * **DataLakePrincipalIdentifier** *(string) --*

                    An identifier for the Lake Formation principal.

                * **Permissions** *(list) --*

                  The permissions that are granted to the principal.

                  * *(string) --*

            * **AllowFullTableExternalDataAccess** *(string) --*

              Allows third-party engines to access data in Amazon S3
              locations that are registered with Lake Formation.

        * **NextToken** *(string) --*

          A continuation token for paginating the returned list,
          present if the current segment of the list is not the
          last.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.FederationSourceException"

   * "Glue.Client.exceptions.FederationSourceRetryableException"


get_catalog_import_status
*************************

Glue.Client.get_catalog_import_status(**kwargs)

   Retrieves the status of a migration operation.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_catalog_import_status(
          CatalogId='string'
      )

   Parameters:
      **CatalogId** (*string*) -- The ID of the catalog to migrate.
      Currently, this should be the Amazon Web Services account ID.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'ImportStatus': {
                 'ImportCompleted': True|False,
                 'ImportTime': datetime(2015, 1, 1),
                 'ImportedBy': 'string'
             }
         }

      **Response Structure**

      * *(dict) --*

        * **ImportStatus** *(dict) --*

          The status of the specified catalog migration.

          * **ImportCompleted** *(boolean) --*

            "True" if the migration has completed, or "False"
            otherwise.

          * **ImportTime** *(datetime) --*

            The time that the migration was started.

          * **ImportedBy** *(string) --*

            The name of the person who initiated the migration.

   **Exceptions**

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
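
   **Example**

   A minimal sketch of reading the migration flag from the response
   described above. The helper name "catalog_import_completed" is
   hypothetical, and "client" is assumed to be a Glue client such as
   the one returned by "boto3.client('glue')".

```python
def catalog_import_completed(client, catalog_id=None):
    """Return True if Glue reports the catalog migration as completed."""
    # CatalogId is not marked as required, so it is only sent when provided.
    kwargs = {"CatalogId": catalog_id} if catalog_id else {}
    response = client.get_catalog_import_status(**kwargs)
    # Treat a missing ImportStatus or ImportCompleted field as "not completed".
    status = response.get("ImportStatus", {})
    return bool(status.get("ImportCompleted", False))
```

   Calling "catalog_import_completed(boto3.client('glue'))" would query
   the caller's own account, assuming credentials and a region are
   configured.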


delete_trigger
**************

Glue.Client.delete_trigger(**kwargs)

   Deletes a specified trigger. If the trigger is not found, no
   exception is thrown.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_trigger(
          Name='string'
      )

   Parameters:
      **Name** (*string*) --

      **[REQUIRED]**

      The name of the trigger to delete.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Name': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Name** *(string) --*

          The name of the trigger that was deleted.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ConcurrentModificationException"
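
   **Example**

   A minimal sketch of deleting several triggers in a row. Because a
   missing trigger does not raise an exception, the loop does not
   special-case that situation; the other listed exceptions are left to
   propagate. The helper name "delete_triggers" is hypothetical, and
   "client" is assumed to be a Glue client such as
   "boto3.client('glue')".

```python
def delete_triggers(client, names):
    """Delete each named trigger and return the names echoed back by Glue."""
    deleted = []
    for name in names:
        response = client.delete_trigger(Name=name)
        # 'Name' is the only documented response field; skip it if absent.
        if "Name" in response:
            deleted.append(response["Name"])
    return deleted
```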


update_integration_table_properties
***********************************

Glue.Client.update_integration_table_properties(**kwargs)

   This API is used to provide optional override properties for the
   tables that need to be replicated. These properties can include
   properties for filtering and partitioning for the source and target
   tables. To set both source and target properties, invoke the same
   API twice: once with the Glue connection ARN as "ResourceArn" and a
   "SourceTableConfig", and once with the Glue database ARN as
   "ResourceArn" and a "TargetTableConfig".

   The override is reflected across all integrations that use the
   same "ResourceArn" and source table.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.update_integration_table_properties(
          ResourceArn='string',
          TableName='string',
          SourceTableConfig={
              'Fields': [
                  'string',
              ],
              'FilterPredicate': 'string',
              'PrimaryKey': [
                  'string',
              ],
              'RecordUpdateField': 'string'
          },
          TargetTableConfig={
              'UnnestSpec': 'TOPLEVEL'|'FULL'|'NOUNNEST',
              'PartitionSpec': [
                  {
                      'FieldName': 'string',
                      'FunctionSpec': 'string',
                      'ConversionSpec': 'string'
                  },
              ],
              'TargetTableName': 'string'
          }
      )

   Parameters:
      * **ResourceArn** (*string*) --

        **[REQUIRED]**

        The connection ARN of the source, or the database ARN of the
        target.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table to be replicated.

      * **SourceTableConfig** (*dict*) --

        A structure for the source table configuration.

        * **Fields** *(list) --*

          A list of fields used for column-level filtering. Currently
          unsupported.

          * *(string) --*

        * **FilterPredicate** *(string) --*

          A condition clause used for row-level filtering. Currently
          unsupported.

        * **PrimaryKey** *(list) --*

          Provide the primary key set for this table. Currently
          supported specifically for SAP "EntityOf" entities upon
          request. Contact Amazon Web Services Support to make this
          feature available.

          * *(string) --*

        * **RecordUpdateField** *(string) --*

          Incremental pull timestamp-based field. Currently
          unsupported.

      * **TargetTableConfig** (*dict*) --

        A structure for the target table configuration.

        * **UnnestSpec** *(string) --*

          Specifies how nested objects are flattened to top-level
          elements. Valid values are: "TOPLEVEL", "FULL", or
          "NOUNNEST".

        * **PartitionSpec** *(list) --*

          Determines the file layout on the target.

          * *(dict) --*

            A structure that describes how data is partitioned on the
            target.

            * **FieldName** *(string) --*

              The field name used to partition data on the target.
              Avoid using columns that have unique values for each row
              (for example, *LastModifiedTimestamp*,
              *SystemModTimeStamp*) as the partition column. These
              columns are not suitable for partitioning because they
              create a large number of small partitions, which can
              lead to performance issues.

            * **FunctionSpec** *(string) --*

              Specifies the function used to partition data on the
              target. The only accepted value for this parameter is
              *'identity'* (string). The *'identity'* function ensures
              that the data partitioning on the target follows the
              same scheme as the source. In other words, the
              partitioning structure of the source data is preserved
              in the target destination.

            * **ConversionSpec** *(string) --*

              Specifies the timestamp format of the source data. Valid
              values are:

              * "epoch_sec" - Unix epoch timestamp in seconds

              * "epoch_milli" - Unix epoch timestamp in milliseconds

              * "iso" - ISO 8601 formatted timestamp

              Note:

                Only specify "ConversionSpec" when using
                timestamp-based partition functions (year, month, day,
                or hour). Glue Zero-ETL uses this parameter to
                correctly transform source data into timestamp format
                before partitioning.

                Do not use high-cardinality columns with the
                "identity" partition function. High-cardinality
                columns include:

                * Primary keys

                * Timestamp fields (such as "LastModifiedTimestamp",
                  "CreatedDate")

                * System-generated timestamps

                Using high-cardinality columns with identity
                partitioning creates many small partitions, which can
                significantly degrade ingestion performance.

        * **TargetTableName** *(string) --*

          The optional name of a target table.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.ValidationException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.ResourceNotFoundException"

   * "Glue.Client.exceptions.InternalServerException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.InvalidInputException"
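
   **Example**

   A minimal sketch of the two-call pattern described above: one call
   keyed by the connection ARN for source overrides and one keyed by the
   database ARN for target overrides. The helper names
   "partition_field" and "update_source_and_target" are hypothetical,
   the ARNs are supplied by the caller, and "client" is assumed to be a
   Glue client such as "boto3.client('glue')". The "ConversionSpec"
   check only mirrors the three values listed in this section.

```python
VALID_CONVERSION_SPECS = {"epoch_sec", "epoch_milli", "iso"}


def partition_field(field_name, function_spec="identity", conversion_spec=None):
    """Build one PartitionSpec entry for TargetTableConfig."""
    entry = {"FieldName": field_name, "FunctionSpec": function_spec}
    if conversion_spec is not None:
        # Reject values other than the ones documented for ConversionSpec.
        if conversion_spec not in VALID_CONVERSION_SPECS:
            raise ValueError(f"unsupported ConversionSpec: {conversion_spec!r}")
        entry["ConversionSpec"] = conversion_spec
    return entry


def update_source_and_target(client, connection_arn, database_arn, table_name,
                             source_config, target_config):
    """Apply source and target overrides for one replicated table."""
    # Source-side overrides are keyed by the Glue connection ARN.
    client.update_integration_table_properties(
        ResourceArn=connection_arn,
        TableName=table_name,
        SourceTableConfig=source_config,
    )
    # Target-side overrides are keyed by the Glue database ARN.
    client.update_integration_table_properties(
        ResourceArn=database_arn,
        TableName=table_name,
        TargetTableConfig=target_config,
    )
```

   A target configuration could then be assembled as
   "{'UnnestSpec': 'FULL', 'PartitionSpec': [partition_field('region')]}",
   where "region" stands in for a low-cardinality column.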


get_mapping
***********

Glue.Client.get_mapping(**kwargs)

   Creates mappings.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_mapping(
          Source={
              'DatabaseName': 'string',
              'TableName': 'string'
          },
          Sinks=[
              {
                  'DatabaseName': 'string',
                  'TableName': 'string'
              },
          ],
          Location={
              'Jdbc': [
                  {
                      'Name': 'string',
                      'Value': 'string',
                      'Param': True|False
                  },
              ],
              'S3': [
                  {
                      'Name': 'string',
                      'Value': 'string',
                      'Param': True|False
                  },
              ],
              'DynamoDB': [
                  {
                      'Name': 'string',
                      'Value': 'string',
                      'Param': True|False
                  },
              ]
          }
      )

   Parameters:
      * **Source** (*dict*) --

        **[REQUIRED]**

        Specifies the source table.

        * **DatabaseName** *(string) --* **[REQUIRED]**

          The database in which the table metadata resides.

        * **TableName** *(string) --* **[REQUIRED]**

          The name of the table in question.

      * **Sinks** (*list*) --

        A list of target tables.

        * *(dict) --*

          Specifies a table definition in the Glue Data Catalog.

          * **DatabaseName** *(string) --* **[REQUIRED]**

            The database in which the table metadata resides.

          * **TableName** *(string) --* **[REQUIRED]**

            The name of the table in question.

      * **Location** (*dict*) --

        Parameters for the mapping.

        * **Jdbc** *(list) --*

          A JDBC location.

          * *(dict) --*

            An argument or property of a node.

            * **Name** *(string) --* **[REQUIRED]**

              The name of the argument or property.

            * **Value** *(string) --* **[REQUIRED]**

              The value of the argument or property.

            * **Param** *(boolean) --*

              True if the value is used as a parameter.

        * **S3** *(list) --*

          An Amazon Simple Storage Service (Amazon S3) location.

          * *(dict) --*

            An argument or property of a node.

            * **Name** *(string) --* **[REQUIRED]**

              The name of the argument or property.

            * **Value** *(string) --* **[REQUIRED]**

              The value of the argument or property.

            * **Param** *(boolean) --*

              True if the value is used as a parameter.

        * **DynamoDB** *(list) --*

          An Amazon DynamoDB table location.

          * *(dict) --*

            An argument or property of a node.

            * **Name** *(string) --* **[REQUIRED]**

              The name of the argument or property.

            * **Value** *(string) --* **[REQUIRED]**

              The value of the argument or property.

            * **Param** *(boolean) --*

              True if the value is used as a parameter.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Mapping': [
                 {
                     'SourceTable': 'string',
                     'SourcePath': 'string',
                     'SourceType': 'string',
                     'TargetTable': 'string',
                     'TargetPath': 'string',
                     'TargetType': 'string'
                 },
             ]
         }

      **Response Structure**

      * *(dict) --*

        * **Mapping** *(list) --*

          A list of mappings to the specified targets.

          * *(dict) --*

            Defines a mapping.

            * **SourceTable** *(string) --*

              The name of the source table.

            * **SourcePath** *(string) --*

              The source path.

            * **SourceType** *(string) --*

              The source type.

            * **TargetTable** *(string) --*

              The target table.

            * **TargetPath** *(string) --*

              The target path.

            * **TargetType** *(string) --*

              The target type.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.EntityNotFoundException"
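
   **Example**

   A minimal sketch that requests mappings for one source table and
   flattens the "Mapping" list into tuples. The helper name
   "get_column_mappings" and the "(database, table)" pair format for
   sinks are hypothetical conveniences, and "client" is assumed to be a
   Glue client such as "boto3.client('glue')".

```python
def get_column_mappings(client, database, table, sinks=None):
    """Return (source_path, source_type, target_path, target_type) tuples."""
    kwargs = {"Source": {"DatabaseName": database, "TableName": table}}
    if sinks:
        # Each sink is given as a (database_name, table_name) pair.
        kwargs["Sinks"] = [
            {"DatabaseName": db, "TableName": tbl} for db, tbl in sinks
        ]
    response = client.get_mapping(**kwargs)
    return [
        (m.get("SourcePath"), m.get("SourceType"),
         m.get("TargetPath"), m.get("TargetType"))
        for m in response.get("Mapping", [])
    ]
```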


get_data_quality_model
**********************

Glue.Client.get_data_quality_model(**kwargs)

   Retrieves the training status of the data quality model, along
   with additional details ("CompletedOn", "StartedOn",
   "FailureReason").

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_data_quality_model(
          StatisticId='string',
          ProfileId='string'
      )

   Parameters:
      * **StatisticId** (*string*) -- The Statistic ID.

      * **ProfileId** (*string*) --

        **[REQUIRED]**

        The Profile ID.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Status': 'RUNNING'|'SUCCEEDED'|'FAILED',
             'StartedOn': datetime(2015, 1, 1),
             'CompletedOn': datetime(2015, 1, 1),
             'FailureReason': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Status** *(string) --*

          The training status of the data quality model.

        * **StartedOn** *(datetime) --*

          The timestamp when the data quality model training started.

        * **CompletedOn** *(datetime) --*

          The timestamp when the data quality model training
          completed.

        * **FailureReason** *(string) --*

          The training failure reason.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"
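
   **Example**

   A minimal polling sketch that repeats the call until "Status" leaves
   "RUNNING". The helper name "wait_for_model", the polling interval,
   and the attempt limit are all assumptions chosen for illustration,
   not values from the Glue documentation. "client" is assumed to be a
   Glue client such as "boto3.client('glue')".

```python
import time


def wait_for_model(client, profile_id, statistic_id=None,
                   poll_seconds=30, max_attempts=20, sleep=time.sleep):
    """Poll get_data_quality_model until training succeeds or fails."""
    kwargs = {"ProfileId": profile_id}
    if statistic_id is not None:
        kwargs["StatisticId"] = statistic_id
    for attempt in range(max_attempts):
        response = client.get_data_quality_model(**kwargs)
        # SUCCEEDED and FAILED are the terminal values in the response syntax.
        if response.get("Status") in ("SUCCEEDED", "FAILED"):
            return response
        # Wait between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(poll_seconds)
    raise TimeoutError("model training still RUNNING after polling limit")
```

   On a "FAILED" result, the returned dict can be inspected for
   "FailureReason". The "sleep" parameter exists so that callers and
   tests can substitute their own wait function.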


batch_get_partition
*******************

Glue.Client.batch_get_partition(**kwargs)

   Retrieves partitions in a batch request.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.batch_get_partition(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          PartitionsToGet=[
              {
                  'Values': [
                      'string',
                  ]
              },
          ]
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog where
        the partitions in question reside. If none is supplied, the
        Amazon Web Services account ID is used by default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the catalog database where the partitions reside.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the partitions' table.

      * **PartitionsToGet** (*list*) --

        **[REQUIRED]**

        A list of partition values identifying the partitions to
        retrieve.

        * *(dict) --*

          Contains a list of values defining partitions.

          * **Values** *(list) --* **[REQUIRED]**

            The list of values.

            * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Partitions': [
                 {
                     'Values': [
                         'string',
                     ],
                     'DatabaseName': 'string',
                     'TableName': 'string',
                     'CreationTime': datetime(2015, 1, 1),
                     'LastAccessTime': datetime(2015, 1, 1),
                     'StorageDescriptor': {
                         'Columns': [
                             {
                                 'Name': 'string',
                                 'Type': 'string',
                                 'Comment': 'string',
                                 'Parameters': {
                                     'string': 'string'
                                 }
                             },
                         ],
                         'Location': 'string',
                         'AdditionalLocations': [
                             'string',
                         ],
                         'InputFormat': 'string',
                         'OutputFormat': 'string',
                         'Compressed': True|False,
                         'NumberOfBuckets': 123,
                         'SerdeInfo': {
                             'Name': 'string',
                             'SerializationLibrary': 'string',
                             'Parameters': {
                                 'string': 'string'
                             }
                         },
                         'BucketColumns': [
                             'string',
                         ],
                         'SortColumns': [
                             {
                                 'Column': 'string',
                                 'SortOrder': 123
                             },
                         ],
                         'Parameters': {
                             'string': 'string'
                         },
                         'SkewedInfo': {
                             'SkewedColumnNames': [
                                 'string',
                             ],
                             'SkewedColumnValues': [
                                 'string',
                             ],
                             'SkewedColumnValueLocationMaps': {
                                 'string': 'string'
                             }
                         },
                         'StoredAsSubDirectories': True|False,
                         'SchemaReference': {
                             'SchemaId': {
                                 'SchemaArn': 'string',
                                 'SchemaName': 'string',
                                 'RegistryName': 'string'
                             },
                             'SchemaVersionId': 'string',
                             'SchemaVersionNumber': 123
                         }
                     },
                     'Parameters': {
                         'string': 'string'
                     },
                     'LastAnalyzedTime': datetime(2015, 1, 1),
                     'CatalogId': 'string'
                 },
             ],
             'UnprocessedKeys': [
                 {
                     'Values': [
                         'string',
                     ]
                 },
             ]
         }

      **Response Structure**

      * *(dict) --*

        * **Partitions** *(list) --*

          A list of the requested partitions.

          * *(dict) --*

            Represents a slice of table data.

            * **Values** *(list) --*

              The values of the partition.

              * *(string) --*

            * **DatabaseName** *(string) --*

              The name of the catalog database that contains the
              partition.

            * **TableName** *(string) --*

              The name of the database table that contains the
              partition.

            * **CreationTime** *(datetime) --*

              The time at which the partition was created.

            * **LastAccessTime** *(datetime) --*

              The last time at which the partition was accessed.

            * **StorageDescriptor** *(dict) --*

              Provides information about the physical location where
              the partition is stored.

              * **Columns** *(list) --*

                A list of the "Columns" in the table.

                * *(dict) --*

                  A column in a "Table".

                  * **Name** *(string) --*

                    The name of the "Column".

                  * **Type** *(string) --*

                    The data type of the "Column".

                  * **Comment** *(string) --*

                    A free-form text comment.

                  * **Parameters** *(dict) --*

                    These key-value pairs define properties associated
                    with the column.

                    * *(string) --*

                      * *(string) --*

              * **Location** *(string) --*

                The physical location of the table. By default, this
                takes the form of the warehouse location, followed by
                the database location in the warehouse, followed by
                the table name.

              * **AdditionalLocations** *(list) --*

                A list of locations that point to the path where a
                Delta table is located.

                * *(string) --*

              * **InputFormat** *(string) --*

                The input format: "SequenceFileInputFormat" (binary),
                or "TextInputFormat", or a custom format.

              * **OutputFormat** *(string) --*

                The output format: "SequenceFileOutputFormat"
                (binary), or "IgnoreKeyTextOutputFormat", or a custom
                format.

              * **Compressed** *(boolean) --*

                "True" if the data in the table is compressed, or
                "False" if not.

              * **NumberOfBuckets** *(integer) --*

                Must be specified if the table contains any dimension
                columns.

              * **SerdeInfo** *(dict) --*

                The serialization/deserialization (SerDe) information.

                * **Name** *(string) --*

                  Name of the SerDe.

                * **SerializationLibrary** *(string) --*

                  Usually the class that implements the SerDe. An
                  example is
                  "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe".

                * **Parameters** *(dict) --*

                  These key-value pairs define initialization
                  parameters for the SerDe.

                  * *(string) --*

                    * *(string) --*

              * **BucketColumns** *(list) --*

                A list of reducer grouping columns, clustering
                columns, and bucketing columns in the table.

                * *(string) --*

              * **SortColumns** *(list) --*

                A list specifying the sort order of each bucket in the
                table.

                * *(dict) --*

                  Specifies the sort order of a sorted column.

                  * **Column** *(string) --*

                    The name of the column.

                  * **SortOrder** *(integer) --*

                    Indicates that the column is sorted in ascending
                    order ("== 1"), or in descending order ("== 0").

              * **Parameters** *(dict) --*

                The user-supplied properties in key-value form.

                * *(string) --*

                  * *(string) --*

              * **SkewedInfo** *(dict) --*

                The information about values that appear frequently in
                a column (skewed values).

                * **SkewedColumnNames** *(list) --*

                  A list of names of columns that contain skewed
                  values.

                  * *(string) --*

                * **SkewedColumnValues** *(list) --*

                  A list of values that appear so frequently as to be
                  considered skewed.

                  * *(string) --*

                * **SkewedColumnValueLocationMaps** *(dict) --*

                  A mapping of skewed values to the columns that
                  contain them.

                  * *(string) --*

                    * *(string) --*

              * **StoredAsSubDirectories** *(boolean) --*

                "True" if the table data is stored in subdirectories,
                or "False" if not.

              * **SchemaReference** *(dict) --*

                An object that references a schema stored in the Glue
                Schema Registry.

                When creating a table, you can pass an empty list of
                columns for the schema, and instead use a schema
                reference.

                * **SchemaId** *(dict) --*

                  A structure that contains schema identity fields.
                  Either this or the "SchemaVersionId" has to be
                  provided.

                  * **SchemaArn** *(string) --*

                    The Amazon Resource Name (ARN) of the schema. One
                    of "SchemaArn" or "SchemaName" has to be provided.

                  * **SchemaName** *(string) --*

                    The name of the schema. One of "SchemaArn" or
                    "SchemaName" has to be provided.

                  * **RegistryName** *(string) --*

                    The name of the schema registry that contains the
                    schema.

                * **SchemaVersionId** *(string) --*

                  The unique ID assigned to a version of the schema.
                  Either this or the "SchemaId" has to be provided.

                * **SchemaVersionNumber** *(integer) --*

                  The version number of the schema.

            * **Parameters** *(dict) --*

              These key-value pairs define partition parameters.

              * *(string) --*

                * *(string) --*

            * **LastAnalyzedTime** *(datetime) --*

              The last time at which column statistics were computed
              for this partition.

            * **CatalogId** *(string) --*

              The ID of the Data Catalog in which the partition
              resides.

        * **UnprocessedKeys** *(list) --*

          A list of the partition values in the request for which
          partitions were not returned.

          * *(dict) --*

            Contains a list of values defining partitions.

            * **Values** *(list) --*

              The list of values.

              * *(string) --*

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.GlueEncryptionException"

   * "Glue.Client.exceptions.InvalidStateException"

   * "Glue.Client.exceptions.FederationSourceException"

   * "Glue.Client.exceptions.FederationSourceRetryableException"
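
   **Example**

   A minimal sketch that splits a long list of partition values into
   several requests and collects both "Partitions" and
   "UnprocessedKeys". The helper name "batch_get_partitions" is
   hypothetical, and the default "batch_size" of 1000 is an assumption
   about the per-request limit that this section does not state; adjust
   it to the limit that applies. "client" is assumed to be a Glue client
   such as "boto3.client('glue')".

```python
def batch_get_partitions(client, database, table, partition_values,
                         batch_size=1000, catalog_id=None):
    """Fetch partitions in chunks; return (partitions, unprocessed_keys)."""
    partitions, unprocessed = [], []
    for start in range(0, len(partition_values), batch_size):
        chunk = partition_values[start:start + batch_size]
        kwargs = {
            "DatabaseName": database,
            "TableName": table,
            # Each entry lists the values that identify one partition.
            "PartitionsToGet": [{"Values": list(values)} for values in chunk],
        }
        if catalog_id:
            kwargs["CatalogId"] = catalog_id
        response = client.batch_get_partition(**kwargs)
        partitions.extend(response.get("Partitions", []))
        # Keys that Glue did not return are reported back, not raised.
        unprocessed.extend(response.get("UnprocessedKeys", []))
    return partitions, unprocessed
```

   The second element of the result lists the requested keys for which
   no partition was returned, so callers can decide whether to retry
   them or treat them as missing.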


get_partition
*************

Glue.Client.get_partition(**kwargs)

   Retrieves information about a specified partition.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_partition(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          PartitionValues=[
              'string',
          ]
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog where
        the partition in question resides. If none is provided, the
        Amazon Web Services account ID is used by default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the catalog database where the partition resides.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the partition's table.

      * **PartitionValues** (*list*) --

        **[REQUIRED]**

        The values that define the partition.

        * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Partition': {
                 'Values': [
                     'string',
                 ],
                 'DatabaseName': 'string',
                 'TableName': 'string',
                 'CreationTime': datetime(2015, 1, 1),
                 'LastAccessTime': datetime(2015, 1, 1),
                 'StorageDescriptor': {
                     'Columns': [
                         {
                             'Name': 'string',
                             'Type': 'string',
                             'Comment': 'string',
                             'Parameters': {
                                 'string': 'string'
                             }
                         },
                     ],
                     'Location': 'string',
                     'AdditionalLocations': [
                         'string',
                     ],
                     'InputFormat': 'string',
                     'OutputFormat': 'string',
                     'Compressed': True|False,
                     'NumberOfBuckets': 123,
                     'SerdeInfo': {
                         'Name': 'string',
                         'SerializationLibrary': 'string',
                         'Parameters': {
                             'string': 'string'
                         }
                     },
                     'BucketColumns': [
                         'string',
                     ],
                     'SortColumns': [
                         {
                             'Column': 'string',
                             'SortOrder': 123
                         },
                     ],
                     'Parameters': {
                         'string': 'string'
                     },
                     'SkewedInfo': {
                         'SkewedColumnNames': [
                             'string',
                         ],
                         'SkewedColumnValues': [
                             'string',
                         ],
                         'SkewedColumnValueLocationMaps': {
                             'string': 'string'
                         }
                     },
                     'StoredAsSubDirectories': True|False,
                     'SchemaReference': {
                         'SchemaId': {
                             'SchemaArn': 'string',
                             'SchemaName': 'string',
                             'RegistryName': 'string'
                         },
                         'SchemaVersionId': 'string',
                         'SchemaVersionNumber': 123
                     }
                 },
                 'Parameters': {
                     'string': 'string'
                 },
                 'LastAnalyzedTime': datetime(2015, 1, 1),
                 'CatalogId': 'string'
             }
         }

      **Response Structure**

      * *(dict) --*

        * **Partition** *(dict) --*

          The requested information, in the form of a "Partition"
          object.

          * **Values** *(list) --*

            The values of the partition.

            * *(string) --*

          * **DatabaseName** *(string) --*

            The name of the catalog database that contains the
            partition.

          * **TableName** *(string) --*

            The name of the database table that contains the
            partition.

          * **CreationTime** *(datetime) --*

            The time at which the partition was created.

          * **LastAccessTime** *(datetime) --*

            The last time at which the partition was accessed.

          * **StorageDescriptor** *(dict) --*

            Provides information about the physical location where the
            partition is stored.

            * **Columns** *(list) --*

              A list of the "Columns" in the table.

              * *(dict) --*

                A column in a "Table".

                * **Name** *(string) --*

                  The name of the "Column".

                * **Type** *(string) --*

                  The data type of the "Column".

                * **Comment** *(string) --*

                  A free-form text comment.

                * **Parameters** *(dict) --*

                  These key-value pairs define properties associated
                  with the column.

                  * *(string) --*

                    * *(string) --*

            * **Location** *(string) --*

              The physical location of the table. By default, this
              takes the form of the warehouse location, followed by
              the database location in the warehouse, followed by the
              table name.

            * **AdditionalLocations** *(list) --*

              A list of locations that point to the path where a Delta
              table is located.

              * *(string) --*

            * **InputFormat** *(string) --*

              The input format: "SequenceFileInputFormat" (binary), or
              "TextInputFormat", or a custom format.

            * **OutputFormat** *(string) --*

              The output format: "SequenceFileOutputFormat" (binary),
              or "IgnoreKeyTextOutputFormat", or a custom format.

            * **Compressed** *(boolean) --*

              "True" if the data in the table is compressed, or
              "False" if not.

            * **NumberOfBuckets** *(integer) --*

              Must be specified if the table contains any dimension
              columns.

            * **SerdeInfo** *(dict) --*

              The serialization/deserialization (SerDe) information.

              * **Name** *(string) --*

                Name of the SerDe.

              * **SerializationLibrary** *(string) --*

                Usually the class that implements the SerDe. An
                example is
                "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe".

              * **Parameters** *(dict) --*

                These key-value pairs define initialization parameters
                for the SerDe.

                * *(string) --*

                  * *(string) --*

            * **BucketColumns** *(list) --*

              A list of reducer grouping columns, clustering columns,
              and bucketing columns in the table.

              * *(string) --*

            * **SortColumns** *(list) --*

              A list specifying the sort order of each bucket in the
              table.

              * *(dict) --*

                Specifies the sort order of a sorted column.

                * **Column** *(string) --*

                  The name of the column.

                * **SortOrder** *(integer) --*

                  Indicates that the column is sorted in ascending
                  order ("== 1"), or in descending order ("== 0").

            * **Parameters** *(dict) --*

              The user-supplied properties in key-value form.

              * *(string) --*

                * *(string) --*

            * **SkewedInfo** *(dict) --*

              The information about values that appear frequently in a
              column (skewed values).

              * **SkewedColumnNames** *(list) --*

                A list of names of columns that contain skewed values.

                * *(string) --*

              * **SkewedColumnValues** *(list) --*

                A list of values that appear so frequently as to be
                considered skewed.

                * *(string) --*

              * **SkewedColumnValueLocationMaps** *(dict) --*

                A mapping of skewed values to the columns that contain
                them.

                * *(string) --*

                  * *(string) --*

            * **StoredAsSubDirectories** *(boolean) --*

              "True" if the table data is stored in subdirectories, or
              "False" if not.

            * **SchemaReference** *(dict) --*

              An object that references a schema stored in the Glue
              Schema Registry.

              When creating a table, you can pass an empty list of
              columns for the schema, and instead use a schema
              reference.

              * **SchemaId** *(dict) --*

                A structure that contains schema identity fields.
                Either this or the "SchemaVersionId" has to be
                provided.

                * **SchemaArn** *(string) --*

                  The Amazon Resource Name (ARN) of the schema. One of
                  "SchemaArn" or "SchemaName" has to be provided.

                * **SchemaName** *(string) --*

                  The name of the schema. One of "SchemaArn" or
                  "SchemaName" has to be provided.

                * **RegistryName** *(string) --*

                  The name of the schema registry that contains the
                  schema.

              * **SchemaVersionId** *(string) --*

                The unique ID assigned to a version of the schema.
                Either this or the "SchemaId" has to be provided.

              * **SchemaVersionNumber** *(integer) --*

                The version number of the schema.

          * **Parameters** *(dict) --*

            These key-value pairs define partition parameters.

            * *(string) --*

              * *(string) --*

          * **LastAnalyzedTime** *(datetime) --*

            The last time at which column statistics were computed for
            this partition.

          * **CatalogId** *(string) --*

            The ID of the Data Catalog in which the partition resides.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"

   * "Glue.Client.exceptions.FederationSourceException"

   * "Glue.Client.exceptions.FederationSourceRetryableException"
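
   As an illustration of the request and response shapes above, the
   following sketch looks up one partition and returns its storage
   location. The helper name "get_partition_location" is hypothetical;
   it treats "EntityNotFoundException" as "no such partition" and lets
   every other exception propagate.

```python
def get_partition_location(client, database, table, values, catalog_id=None):
    """Return the partition's storage location, or None if it does not exist.

    `client` is assumed to be a Glue client, e.g. boto3.client('glue').
    """
    request = {
        "DatabaseName": database,
        "TableName": table,
        "PartitionValues": list(values),
    }
    # CatalogId is optional; when omitted the account ID is used by default.
    if catalog_id is not None:
        request["CatalogId"] = catalog_id
    try:
        response = client.get_partition(**request)
    except client.exceptions.EntityNotFoundException:
        return None
    storage = response["Partition"].get("StorageDescriptor", {})
    return storage.get("Location")
```

   The values in "PartitionValues" are expected in the same order as
   the table's partition keys.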


delete_table
************

Glue.Client.delete_table(**kwargs)

   Removes a table definition from the Data Catalog.

   Note:

     After completing this operation, you no longer have access to the
     table versions and partitions that belong to the deleted table.
     Glue deletes these "orphaned" resources asynchronously in a
     timely manner, at the discretion of the service. To ensure the
     immediate deletion of all related resources, before calling
     "DeleteTable", use "DeleteTableVersion" or
     "BatchDeleteTableVersion", and "DeletePartition" or
     "BatchDeletePartition", to delete any resources that belong to
     the table.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_table(
          CatalogId='string',
          DatabaseName='string',
          Name='string',
          TransactionId='string'
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog where
        the table resides. If none is provided, the Amazon Web
        Services account ID is used by default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the catalog database in which the table resides.
        For Hive compatibility, this name is entirely lowercase.

      * **Name** (*string*) --

        **[REQUIRED]**

        The name of the table to be deleted. For Hive compatibility,
        this name is entirely lowercase.

      * **TransactionId** (*string*) -- The transaction ID at which to
        delete the table contents.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ConcurrentModificationException"

   * "Glue.Client.exceptions.ResourceNotReadyException"

   * "Glue.Client.exceptions.FederationSourceException"

   * "Glue.Client.exceptions.FederationSourceRetryableException"
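
   The note above recommends removing dependent resources before
   calling "DeleteTable". A minimal sketch of that ordering for
   partitions only (table versions are not handled here) might look
   like the following. The helper name is hypothetical, and the batch
   size of 25 is an assumption about the "BatchDeletePartition" limit.

```python
def delete_table_and_partitions(client, database, table, batch_size=25):
    """Delete a table's partitions in batches, then the table definition.

    `client` is assumed to be a Glue client, e.g. boto3.client('glue').
    Returns the list of batch-delete errors; the table is deleted only
    when that list is empty.
    """
    # Collect every partition's values first so that deletions do not
    # interfere with pagination.
    values = []
    request = {"DatabaseName": database, "TableName": table}
    while True:
        page = client.get_partitions(**request)
        values.extend(p["Values"] for p in page.get("Partitions", []))
        token = page.get("NextToken")
        if not token:
            break
        request["NextToken"] = token

    errors = []
    for start in range(0, len(values), batch_size):
        chunk = values[start:start + batch_size]
        result = client.batch_delete_partition(
            DatabaseName=database,
            TableName=table,
            PartitionsToDelete=[{"Values": v} for v in chunk],
        )
        errors.extend(result.get("Errors", []))

    if errors:
        # Leave the table in place so the caller can inspect the failures.
        return errors
    client.delete_table(DatabaseName=database, Name=table)
    return []
```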


run_statement
*************

Glue.Client.run_statement(**kwargs)

   Executes the statement.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.run_statement(
          SessionId='string',
          Code='string',
          RequestOrigin='string'
      )

   Parameters:
      * **SessionId** (*string*) --

        **[REQUIRED]**

        The ID of the session in which to run the statement.

      * **Code** (*string*) --

        **[REQUIRED]**

        The statement code to be run.

      * **RequestOrigin** (*string*) -- The origin of the request.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Id': 123
         }

      **Response Structure**

      * *(dict) --*

        * **Id** *(integer) --*

          The ID of the statement that was run.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.ValidationException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.IllegalSessionStateException"
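
   "run_statement" returns only the statement "Id", so a caller
   normally has to poll for the result. The sketch below assumes the
   companion "get_statement" call and its "State" values
   ("AVAILABLE", "CANCELLED", "ERROR" as terminal states), which are
   not described in this section; the helper name and polling
   parameters are hypothetical.

```python
import time

# Assumed terminal values of Statement["State"] as reported by get_statement.
TERMINAL_STATES = {"AVAILABLE", "CANCELLED", "ERROR"}


def run_and_wait(client, session_id, code, poll_seconds=2.0, max_polls=30,
                 sleep=time.sleep):
    """Submit code to an interactive session and poll until it finishes.

    `client` is assumed to be a Glue client, e.g. boto3.client('glue').
    Returns the final Statement dict, or raises TimeoutError.
    """
    statement_id = client.run_statement(SessionId=session_id, Code=code)["Id"]
    for _ in range(max_polls):
        statement = client.get_statement(
            SessionId=session_id, Id=statement_id
        )["Statement"]
        if statement.get("State") in TERMINAL_STATES:
            return statement
        sleep(poll_seconds)
    raise TimeoutError(
        f"statement {statement_id} did not finish after {max_polls} polls"
    )
```

   The "sleep" parameter exists only so the wait can be replaced in
   tests.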


get_workflow_run_properties
***************************

Glue.Client.get_workflow_run_properties(**kwargs)

   Retrieves the workflow run properties which were set during the
   run.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_workflow_run_properties(
          Name='string',
          RunId='string'
      )

   Parameters:
      * **Name** (*string*) --

        **[REQUIRED]**

        Name of the workflow which was run.

      * **RunId** (*string*) --

        **[REQUIRED]**

        The ID of the workflow run whose run properties should be
        returned.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'RunProperties': {
                 'string': 'string'
             }
         }

      **Response Structure**

      * *(dict) --*

        * **RunProperties** *(dict) --*

          The workflow run properties which were set during the
          specified run.

          * *(string) --*

            * *(string) --*

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
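
   Because "RunProperties" is a plain string-to-string map, reading a
   single property is a dictionary lookup. A small sketch, with a
   hypothetical helper name and a caller-supplied default:

```python
def get_run_property(client, workflow_name, run_id, key, default=None):
    """Return one workflow run property, or `default` if it was never set.

    `client` is assumed to be a Glue client, e.g. boto3.client('glue').
    """
    response = client.get_workflow_run_properties(
        Name=workflow_name,
        RunId=run_id,
    )
    return response.get("RunProperties", {}).get(key, default)
```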


create_column_statistics_task_settings
**************************************

Glue.Client.create_column_statistics_task_settings(**kwargs)

   Creates settings for a column statistics task.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_column_statistics_task_settings(
          DatabaseName='string',
          TableName='string',
          Role='string',
          Schedule='string',
          ColumnNameList=[
              'string',
          ],
          SampleSize=123.0,
          CatalogID='string',
          SecurityConfiguration='string',
          Tags={
              'string': 'string'
          }
      )

   Parameters:
      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the database where the table resides.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table for which to generate column statistics.

      * **Role** (*string*) --

        **[REQUIRED]**

        The role used for running the column statistics.

      * **Schedule** (*string*) -- A schedule for running the column
        statistics, specified in CRON syntax.

      * **ColumnNameList** (*list*) --

        A list of column names for which to run statistics.

        * *(string) --*

      * **SampleSize** (*float*) -- The percentage of data to sample.

      * **CatalogID** (*string*) -- The ID of the Data Catalog in
        which the database resides.

      * **SecurityConfiguration** (*string*) -- Name of the security
        configuration that is used to encrypt CloudWatch logs.

      * **Tags** (*dict*) --

        A map of tags.

        * *(string) --*

          * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.ColumnStatisticsTaskRunningException"
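
   Only three parameters are required, so it can help to build the
   request from optional pieces. The sketch below uses a hypothetical
   helper name; the range check on "SampleSize" is an assumption
   based on its description as a percentage, not a documented
   constraint.

```python
def build_column_stats_settings(database, table, role, schedule=None,
                                columns=None, sample_size=None, tags=None):
    """Build keyword arguments for create_column_statistics_task_settings.

    Optional parameters left as None are omitted from the request.
    """
    # Assumed sanity check: SampleSize is described as a percentage.
    if sample_size is not None and not 0 < sample_size <= 100:
        raise ValueError("sample_size is a percentage and must be in (0, 100]")
    request = {"DatabaseName": database, "TableName": table, "Role": role}
    optional = {
        "Schedule": schedule,
        "ColumnNameList": list(columns) if columns is not None else None,
        "SampleSize": float(sample_size) if sample_size is not None else None,
        "Tags": tags,
    }
    request.update({k: v for k, v in optional.items() if v is not None})
    return request
```

   The result can then be passed as
   "client.create_column_statistics_task_settings(**request)".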


get_databases
*************

Glue.Client.get_databases(**kwargs)

   Retrieves all databases defined in a given Data Catalog.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_databases(
          CatalogId='string',
          NextToken='string',
          MaxResults=123,
          ResourceShareType='FOREIGN'|'ALL'|'FEDERATED',
          AttributesToGet=[
              'NAME',
          ]
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog from
        which to retrieve "Databases". If none is provided, the Amazon
        Web Services account ID is used by default.

      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation call.

      * **MaxResults** (*integer*) -- The maximum number of databases
        to return in one response.

      * **ResourceShareType** (*string*) --

        Allows you to specify that you want to list the databases
        shared with your account. The allowable values are
        "FEDERATED", "FOREIGN" or "ALL".

        * If set to "FEDERATED", will list the federated databases
          (referencing an external entity) shared with your account.

        * If set to "FOREIGN", will list the databases shared with
          your account.

        * If set to "ALL", will list the databases shared with your
          account, as well as the databases in your local account.

      * **AttributesToGet** (*list*) --

        Specifies the database fields returned by the "GetDatabases"
        call. This parameter doesn’t accept an empty list. The request
        must include the "NAME".

        * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'DatabaseList': [
                 {
                     'Name': 'string',
                     'Description': 'string',
                     'LocationUri': 'string',
                     'Parameters': {
                         'string': 'string'
                     },
                     'CreateTime': datetime(2015, 1, 1),
                     'CreateTableDefaultPermissions': [
                         {
                             'Principal': {
                                 'DataLakePrincipalIdentifier': 'string'
                             },
                             'Permissions': [
                                 'ALL'|'SELECT'|'ALTER'|'DROP'|'DELETE'|'INSERT'|'CREATE_DATABASE'|'CREATE_TABLE'|'DATA_LOCATION_ACCESS',
                             ]
                         },
                     ],
                     'TargetDatabase': {
                         'CatalogId': 'string',
                         'DatabaseName': 'string',
                         'Region': 'string'
                     },
                     'CatalogId': 'string',
                     'FederatedDatabase': {
                         'Identifier': 'string',
                         'ConnectionName': 'string',
                         'ConnectionType': 'string'
                     }
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **DatabaseList** *(list) --*

          A list of "Database" objects from the specified catalog.

          * *(dict) --*

            The "Database" object represents a logical grouping of
            tables that might reside in a Hive metastore or an RDBMS.

            * **Name** *(string) --*

              The name of the database. For Hive compatibility, this
              is folded to lowercase when it is stored.

            * **Description** *(string) --*

              A description of the database.

            * **LocationUri** *(string) --*

              The location of the database (for example, an HDFS
              path).

            * **Parameters** *(dict) --*

              These key-value pairs define parameters and properties
              of the database.

              * *(string) --*

                * *(string) --*

            * **CreateTime** *(datetime) --*

              The time at which the metadata database was created in
              the catalog.

            * **CreateTableDefaultPermissions** *(list) --*

              Creates a set of default permissions on the table for
              principals. Used by Lake Formation. Not used in the
              normal course of Glue operations.

              * *(dict) --*

                Permissions granted to a principal.

                * **Principal** *(dict) --*

                  The principal who is granted permissions.

                  * **DataLakePrincipalIdentifier** *(string) --*

                    An identifier for the Lake Formation principal.

                * **Permissions** *(list) --*

                  The permissions that are granted to the principal.

                  * *(string) --*

            * **TargetDatabase** *(dict) --*

              A "DatabaseIdentifier" structure that describes a target
              database for resource linking.

              * **CatalogId** *(string) --*

                The ID of the Data Catalog in which the database
                resides.

              * **DatabaseName** *(string) --*

                The name of the catalog database.

              * **Region** *(string) --*

                Region of the target database.

            * **CatalogId** *(string) --*

              The ID of the Data Catalog in which the database
              resides.

            * **FederatedDatabase** *(dict) --*

              A "FederatedDatabase" structure that references an
              entity outside the Glue Data Catalog.

              * **Identifier** *(string) --*

                A unique identifier for the federated database.

              * **ConnectionName** *(string) --*

                The name of the connection to the external metastore.

              * **ConnectionType** *(string) --*

                The type of connection used to access the federated
                database, such as JDBC, ODBC, or other supported
                connection protocols.

        * **NextToken** *(string) --*

          A continuation token for paginating the returned list of
          databases, returned if the current segment of the list is
          not the last.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.FederationSourceException"

   * "Glue.Client.exceptions.FederationSourceRetryableException"
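
   The "NextToken" fields in the request and response can be chained
   to walk every page. A minimal sketch of that loop follows; the
   helper name "iter_databases" is hypothetical and the default page
   size of 100 is an assumption.

```python
def iter_databases(client, catalog_id=None, resource_share_type=None,
                   page_size=100):
    """Yield every Database dict, following NextToken until exhausted.

    `client` is assumed to be a Glue client, e.g. boto3.client('glue').
    """
    request = {"MaxResults": page_size}
    if catalog_id is not None:
        request["CatalogId"] = catalog_id
    if resource_share_type is not None:
        request["ResourceShareType"] = resource_share_type
    while True:
        response = client.get_databases(**request)
        yield from response.get("DatabaseList", [])
        token = response.get("NextToken")
        if not token:
            return
        # Pass the token back unchanged to request the next segment.
        request["NextToken"] = token
```

   For example, "[db['Name'] for db in iter_databases(client,
   resource_share_type='ALL')]" would collect the names of local and
   shared databases.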


get_plan
********

Glue.Client.get_plan(**kwargs)

   Gets code to perform a specified mapping.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_plan(
          Mapping=[
              {
                  'SourceTable': 'string',
                  'SourcePath': 'string',
                  'SourceType': 'string',
                  'TargetTable': 'string',
                  'TargetPath': 'string',
                  'TargetType': 'string'
              },
          ],
          Source={
              'DatabaseName': 'string',
              'TableName': 'string'
          },
          Sinks=[
              {
                  'DatabaseName': 'string',
                  'TableName': 'string'
              },
          ],
          Location={
              'Jdbc': [
                  {
                      'Name': 'string',
                      'Value': 'string',
                      'Param': True|False
                  },
              ],
              'S3': [
                  {
                      'Name': 'string',
                      'Value': 'string',
                      'Param': True|False
                  },
              ],
              'DynamoDB': [
                  {
                      'Name': 'string',
                      'Value': 'string',
                      'Param': True|False
                  },
              ]
          },
          Language='PYTHON'|'SCALA',
          AdditionalPlanOptionsMap={
              'string': 'string'
          }
      )

   Parameters:
      * **Mapping** (*list*) --

        **[REQUIRED]**

        The list of mappings from a source table to target tables.

        * *(dict) --*

          Defines a mapping.

          * **SourceTable** *(string) --*

            The name of the source table.

          * **SourcePath** *(string) --*

            The source path.

          * **SourceType** *(string) --*

            The source type.

          * **TargetTable** *(string) --*

            The target table.

          * **TargetPath** *(string) --*

            The target path.

          * **TargetType** *(string) --*

            The target type.

      * **Source** (*dict*) --

        **[REQUIRED]**

        The source table.

        * **DatabaseName** *(string) --* **[REQUIRED]**

          The database in which the table metadata resides.

        * **TableName** *(string) --* **[REQUIRED]**

          The name of the table in question.

      * **Sinks** (*list*) --

        The target tables.

        * *(dict) --*

          Specifies a table definition in the Glue Data Catalog.

          * **DatabaseName** *(string) --* **[REQUIRED]**

            The database in which the table metadata resides.

          * **TableName** *(string) --* **[REQUIRED]**

            The name of the table in question.

      * **Location** (*dict*) --

        The parameters for the mapping.

        * **Jdbc** *(list) --*

          A JDBC location.

          * *(dict) --*

            An argument or property of a node.

            * **Name** *(string) --* **[REQUIRED]**

              The name of the argument or property.

            * **Value** *(string) --* **[REQUIRED]**

              The value of the argument or property.

            * **Param** *(boolean) --*

              True if the value is used as a parameter.

        * **S3** *(list) --*

          An Amazon Simple Storage Service (Amazon S3) location.

          * *(dict) --*

            An argument or property of a node.

            * **Name** *(string) --* **[REQUIRED]**

              The name of the argument or property.

            * **Value** *(string) --* **[REQUIRED]**

              The value of the argument or property.

            * **Param** *(boolean) --*

              True if the value is used as a parameter.

        * **DynamoDB** *(list) --*

          An Amazon DynamoDB table location.

          * *(dict) --*

            An argument or property of a node.

            * **Name** *(string) --* **[REQUIRED]**

              The name of the argument or property.

            * **Value** *(string) --* **[REQUIRED]**

              The value of the argument or property.

            * **Param** *(boolean) --*

              True if the value is used as a parameter.

      * **Language** (*string*) -- The programming language of the
        code to perform the mapping.

      * **AdditionalPlanOptionsMap** (*dict*) --

        A map to hold additional optional key-value parameters.

        Currently, these key-value pairs are supported:

        * "inferSchema": Specifies whether to set "inferSchema" to
          true or false for the default script generated by a Glue
          job. For example, to set "inferSchema" to true, pass the
          following key-value pair: "--additional-plan-options-map
          '{"inferSchema":"true"}'"

        * *(string) --*

          * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'PythonScript': 'string',
             'ScalaCode': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **PythonScript** *(string) --*

          A Python script to perform the mapping.

        * **ScalaCode** *(string) --*

          The Scala code to perform the mapping.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
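
   To show how the "Mapping", "Source", and "Sinks" structures fit
   together, the sketch below requests a script that copies columns
   one-to-one between two catalog tables. The helper name is
   hypothetical, and treating "SourcePath" and "TargetPath" as column
   names is an assumption, since this section does not define them
   further.

```python
def generate_mapping_script(client, database, source_table, target_table,
                            column_types, language="PYTHON"):
    """Ask Glue for code that maps each column to a same-named target column.

    `client` is assumed to be a Glue client, e.g. boto3.client('glue').
    `column_types` maps column name -> type string, e.g. {"id": "int"}.
    """
    mapping = [
        {
            "SourceTable": source_table,
            "SourcePath": column,  # assumed to be the column name
            "SourceType": column_type,
            "TargetTable": target_table,
            "TargetPath": column,
            "TargetType": column_type,
        }
        for column, column_type in column_types.items()
    ]
    response = client.get_plan(
        Mapping=mapping,
        Source={"DatabaseName": database, "TableName": source_table},
        Sinks=[{"DatabaseName": database, "TableName": target_table}],
        Language=language,
        AdditionalPlanOptionsMap={"inferSchema": "true"},
    )
    # The response carries the script under a language-specific key.
    key = "PythonScript" if language == "PYTHON" else "ScalaCode"
    return response.get(key)
```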


describe_inbound_integrations
*****************************

Glue.Client.describe_inbound_integrations(**kwargs)

   Returns a list of inbound integrations for the specified
   integration.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.describe_inbound_integrations(
          IntegrationArn='string',
          Marker='string',
          MaxRecords=123,
          TargetArn='string'
      )

   Parameters:
      * **IntegrationArn** (*string*) -- The Amazon Resource Name
        (ARN) of the integration.

      * **Marker** (*string*) -- A token to specify where to start
        paginating. This is the marker from a previously truncated
        response.

      * **MaxRecords** (*integer*) -- The total number of items to
        return in the output.

      * **TargetArn** (*string*) -- The Amazon Resource Name (ARN) of
        the target resource in the integration.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'InboundIntegrations': [
                 {
                     'SourceArn': 'string',
                     'TargetArn': 'string',
                     'IntegrationArn': 'string',
                     'Status': 'CREATING'|'ACTIVE'|'MODIFYING'|'FAILED'|'DELETING'|'SYNCING'|'NEEDS_ATTENTION',
                     'CreateTime': datetime(2015, 1, 1),
                     'IntegrationConfig': {
                         'RefreshInterval': 'string',
                         'SourceProperties': {
                             'string': 'string'
                         }
                     },
                     'Errors': [
                         {
                             'ErrorCode': 'string',
                             'ErrorMessage': 'string'
                         },
                     ]
                 },
             ],
             'Marker': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **InboundIntegrations** *(list) --*

          A list of inbound integrations.

          * *(dict) --*

            A structure for an integration that writes data into a
            resource.

            * **SourceArn** *(string) --*

              The ARN of the source resource for the integration.

            * **TargetArn** *(string) --*

              The ARN of the target resource for the integration.

            * **IntegrationArn** *(string) --*

              The ARN of the zero-ETL integration.

            * **Status** *(string) --*

              The possible statuses are:

              * CREATING: The integration is being created.

              * ACTIVE: The integration was created successfully.

              * MODIFYING: The integration is being modified.

              * FAILED: The integration creation failed.

              * DELETING: The integration is being deleted.

              * SYNCING: The integration is synchronizing.

              * NEEDS_ATTENTION: The integration needs attention, such
                as synchronization.

            * **CreateTime** *(datetime) --*

              The time that the integration was created, in UTC.

            * **IntegrationConfig** *(dict) --*

              Properties associated with the integration.

              * **RefreshInterval** *(string) --*

                Specifies the frequency at which CDC (Change Data
                Capture) pulls or incremental loads should occur. This
                parameter provides flexibility to align the refresh
                rate with your specific data update patterns, system
                load considerations, and performance optimization
                goals. Time increment can be set from 15 minutes to
                8640 minutes (six days). Currently supports creation
                of "RefreshInterval" only.

              * **SourceProperties** *(dict) --*

                A collection of key-value pairs that specify
                additional properties for the integration source.
                These properties provide configuration options that
                can be used to customize the behavior of the ODB
                source during data integration operations.

                * *(string) --*

                  * *(string) --*

            * **Errors** *(list) --*

              A list of errors associated with the integration.

              * *(dict) --*

                An error associated with a zero-ETL integration.

                * **ErrorCode** *(string) --*

                  The code associated with this error.

                * **ErrorMessage** *(string) --*

                  A message describing the error.

        * **Marker** *(string) --*

          A value that indicates the starting point for the next set
          of response records in a subsequent request.

   **Exceptions**

   * "Glue.Client.exceptions.ValidationException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.InternalServerException"

   * "Glue.Client.exceptions.IntegrationNotFoundFault"

   * "Glue.Client.exceptions.TargetResourceNotFound"

   * "Glue.Client.exceptions.OperationNotSupportedException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.InvalidInputException"
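
   The "Marker" request parameter and the "Marker" response field
   described above can be combined to walk through a truncated result
   set. The following is a minimal sketch, not an official helper: the
   function name "list_inbound_integrations" and its arguments are
   hypothetical, and "client" is assumed to be an object created with
   "boto3.client('glue')".

```python
def list_inbound_integrations(client, target_arn=None, integration_arn=None,
                              max_records=None):
    """Collect inbound integrations by following the Marker token.

    `client` is assumed to be a Glue client, for example
    boto3.client('glue'). All filter arguments are optional.
    """
    kwargs = {}
    if target_arn is not None:
        kwargs["TargetArn"] = target_arn
    if integration_arn is not None:
        kwargs["IntegrationArn"] = integration_arn
    if max_records is not None:
        kwargs["MaxRecords"] = max_records

    integrations = []
    while True:
        response = client.describe_inbound_integrations(**kwargs)
        integrations.extend(response.get("InboundIntegrations", []))
        marker = response.get("Marker")
        if not marker:
            # No marker in the response: treated here as the last page.
            break
        # Pass the marker from the truncated response to the next request.
        kwargs["Marker"] = marker
    return integrations
```

   Treating a missing or empty "Marker" as the end of the result set
   is an assumption of this sketch.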


update_workflow
***************

Glue.Client.update_workflow(**kwargs)

   Updates an existing workflow.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.update_workflow(
          Name='string',
          Description='string',
          DefaultRunProperties={
              'string': 'string'
          },
          MaxConcurrentRuns=123
      )

   Parameters:
      * **Name** (*string*) --

        **[REQUIRED]**

        Name of the workflow to be updated.

      * **Description** (*string*) -- The description of the workflow.

      * **DefaultRunProperties** (*dict*) --

        A collection of properties to be used as part of each
        execution of the workflow.

        Run properties may be logged. Do not pass plaintext secrets as
        properties. Retrieve secrets from a Glue Connection, Amazon
        Web Services Secrets Manager or other secret management
        mechanism if you intend to use them within the workflow run.

        * *(string) --*

          * *(string) --*

      * **MaxConcurrentRuns** (*integer*) -- You can use this
        parameter to prevent unwanted multiple updates to data, to
        control costs, or in some cases, to prevent exceeding the
        maximum number of concurrent runs of any of the component
        jobs. If you leave this parameter blank, there is no limit to
        the number of concurrent workflow runs.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Name': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Name** *(string) --*

          The name of the workflow which was specified in input.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ConcurrentModificationException"
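
   Because only "Name" is required, it can help to build the request
   from the values you actually want to send. The following is a
   minimal sketch under that assumption; the helper name
   "build_update_workflow_request" is hypothetical and not part of
   boto3.

```python
def build_update_workflow_request(name, description=None,
                                  default_run_properties=None,
                                  max_concurrent_runs=None):
    """Build keyword arguments for update_workflow, skipping unset values."""
    request = {"Name": name}
    if description is not None:
        request["Description"] = description
    if default_run_properties is not None:
        # Run properties may be logged, so do not place plaintext
        # secrets in this mapping. Keys and values are sent as strings.
        request["DefaultRunProperties"] = {
            str(key): str(value)
            for key, value in default_run_properties.items()
        }
    if max_concurrent_runs is not None:
        request["MaxConcurrentRuns"] = int(max_concurrent_runs)
    return request
```

   With a Glue client, the result could be passed as
   "client.update_workflow(**request)". How the service treats
   optional fields that are omitted from an update is not covered by
   this sketch.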


cancel_ml_task_run
******************

Glue.Client.cancel_ml_task_run(**kwargs)

   Cancels (stops) a task run. Machine learning task runs are
   asynchronous tasks that Glue runs on your behalf as part of various
   machine learning workflows. You can cancel a machine learning task
   run at any time by calling "CancelMLTaskRun" with a task run's
   parent transform's "TransformId" and the task run's "TaskRunId".

   See also: AWS API Documentation

   **Request Syntax**

      response = client.cancel_ml_task_run(
          TransformId='string',
          TaskRunId='string'
      )

   Parameters:
      * **TransformId** (*string*) --

        **[REQUIRED]**

        The unique identifier of the machine learning transform.

      * **TaskRunId** (*string*) --

        **[REQUIRED]**

        A unique identifier for the task run.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'TransformId': 'string',
             'TaskRunId': 'string',
             'Status': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT'
         }

      **Response Structure**

      * *(dict) --*

        * **TransformId** *(string) --*

          The unique identifier of the machine learning transform.

        * **TaskRunId** *(string) --*

          The unique identifier for the task run.

        * **Status** *(string) --*

          The status for this run.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"
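
   A cancellation request returns the run's "Status", which a caller
   may want to inspect. The following is a minimal sketch; the helper
   name "cancel_task_run" is hypothetical, "client" is assumed to be a
   Glue client, and reading "STOPPING" or "STOPPED" as "the
   cancellation took effect" is an interpretation made for this
   example.

```python
def cancel_task_run(client, transform_id, task_run_id):
    """Cancel a machine learning task run and report the returned status.

    Returns a (status, stopping) tuple, where `stopping` is True when
    the reported status is STOPPING or STOPPED.
    """
    response = client.cancel_ml_task_run(
        TransformId=transform_id,
        TaskRunId=task_run_id,
    )
    status = response.get("Status")
    return status, status in ("STOPPING", "STOPPED")
```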


create_integration_table_properties
***********************************

Glue.Client.create_integration_table_properties(**kwargs)

   This API is used to provide optional override properties for the
   tables that need to be replicated. These properties can include
   properties for filtering and partitioning for the source and target
   tables. To set both source and target properties, the same API needs
   to be invoked twice: once with the Glue connection ARN as
   "ResourceArn" with "SourceTableConfig", and once with the Glue
   database ARN as "ResourceArn" with "TargetTableConfig".

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_integration_table_properties(
          ResourceArn='string',
          TableName='string',
          SourceTableConfig={
              'Fields': [
                  'string',
              ],
              'FilterPredicate': 'string',
              'PrimaryKey': [
                  'string',
              ],
              'RecordUpdateField': 'string'
          },
          TargetTableConfig={
              'UnnestSpec': 'TOPLEVEL'|'FULL'|'NOUNNEST',
              'PartitionSpec': [
                  {
                      'FieldName': 'string',
                      'FunctionSpec': 'string',
                      'ConversionSpec': 'string'
                  },
              ],
              'TargetTableName': 'string'
          }
      )

   Parameters:
      * **ResourceArn** (*string*) --

        **[REQUIRED]**

        The Amazon Resource Name (ARN) of the target table for which
        to create integration table properties. Currently, this API
        only supports creating integration table properties for target
        tables, and the provided ARN should be the ARN of the target
        table in the Glue Data Catalog. Support for creating
        integration table properties for source connections (using the
        connection ARN) is not yet implemented and will be added in a
        future release.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table to be replicated.

      * **SourceTableConfig** (*dict*) --

        A structure for the source table configuration. See the
        "SourceTableConfig" structure for a list of supported source
        properties.

        * **Fields** *(list) --*

          A list of fields used for column-level filtering. Currently
          unsupported.

          * *(string) --*

        * **FilterPredicate** *(string) --*

          A condition clause used for row-level filtering. Currently
          unsupported.

        * **PrimaryKey** *(list) --*

          Provide the primary key set for this table. Currently
          supported specifically for SAP "EntityOf" entities upon
          request. Contact Amazon Web Services Support to make this
          feature available.

          * *(string) --*

        * **RecordUpdateField** *(string) --*

          Incremental pull timestamp-based field. Currently
          unsupported.

      * **TargetTableConfig** (*dict*) --

        A structure for the target table configuration.

        * **UnnestSpec** *(string) --*

          Specifies how nested objects are flattened to top-level
          elements. Valid values are: "TOPLEVEL", "FULL", or
          "NOUNNEST".

        * **PartitionSpec** *(list) --*

          Determines the file layout on the target.

          * *(dict) --*

            A structure that describes how data is partitioned on the
            target.

            * **FieldName** *(string) --*

              The field name used to partition data on the target.
              Avoid using columns that have unique values for each row
              (for example, *LastModifiedTimestamp*,
              *SystemModTimeStamp*) as the partition column. These
              columns are not suitable for partitioning because they
              create a large number of small partitions, which can
              lead to performance issues.

            * **FunctionSpec** *(string) --*

              Specifies the function used to partition data on the
              target. The only accepted value for this parameter is
              *'identity'* (string). The *'identity'* function ensures
              that the data partitioning on the target follows the
              same scheme as the source. In other words, the
              partitioning structure of the source data is preserved
              in the target destination.

            * **ConversionSpec** *(string) --*

              Specifies the timestamp format of the source data. Valid
              values are:

              * "epoch_sec" - Unix epoch timestamp in seconds

              * "epoch_milli" - Unix epoch timestamp in milliseconds

              * "iso" - ISO 8601 formatted timestamp

              Note:

                Only specify "ConversionSpec" when using timestamp-based
                partition functions (year, month, day, or hour). Glue
                Zero-ETL uses this parameter to correctly transform
                source data into timestamp format before partitioning.

                Do not use high-cardinality columns with the "identity"
                partition function. High-cardinality columns include:

                * Primary keys

                * Timestamp fields (such as "LastModifiedTimestamp",
                  "CreatedDate")

                * System-generated timestamps

                Using high-cardinality columns with identity
                partitioning creates many small partitions, which can
                significantly degrade ingestion performance.

        * **TargetTableName** *(string) --*

          The optional name of a target table.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.ValidationException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.ResourceNotFoundException"

   * "Glue.Client.exceptions.InternalServerException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.InvalidInputException"
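
   The "PartitionSpec" rules above (allowed "ConversionSpec" values,
   and "ConversionSpec" only with timestamp-based functions) can be
   checked before a request is sent. The following is a minimal
   sketch; the helper names are hypothetical. The function names
   "year", "month", "day", and "hour" are taken from the note on
   "ConversionSpec", while the "FunctionSpec" description lists only
   "identity", so whether the service accepts the others is not
   confirmed here.

```python
TIMESTAMP_FUNCTIONS = {"year", "month", "day", "hour"}
CONVERSION_SPECS = {"epoch_sec", "epoch_milli", "iso"}
UNNEST_SPECS = {"TOPLEVEL", "FULL", "NOUNNEST"}


def partition_field(field_name, function_spec="identity", conversion_spec=None):
    """Build one PartitionSpec entry, validating ConversionSpec usage."""
    entry = {"FieldName": field_name, "FunctionSpec": function_spec}
    if conversion_spec is not None:
        if function_spec not in TIMESTAMP_FUNCTIONS:
            raise ValueError(
                "ConversionSpec applies only to timestamp-based functions")
        if conversion_spec not in CONVERSION_SPECS:
            raise ValueError(f"unsupported ConversionSpec: {conversion_spec!r}")
        entry["ConversionSpec"] = conversion_spec
    return entry


def build_target_table_config(partition_spec, unnest_spec=None,
                              target_table_name=None):
    """Assemble a TargetTableConfig dict from validated pieces."""
    config = {"PartitionSpec": list(partition_spec)}
    if unnest_spec is not None:
        if unnest_spec not in UNNEST_SPECS:
            raise ValueError(f"unsupported UnnestSpec: {unnest_spec!r}")
        config["UnnestSpec"] = unnest_spec
    if target_table_name is not None:
        config["TargetTableName"] = target_table_name
    return config
```

   The resulting dict could be passed as "TargetTableConfig" together
   with "ResourceArn" and "TableName". The sketch does not check
   column cardinality; choosing a low-cardinality partition column
   remains the caller's responsibility.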


batch_delete_connection
***********************

Glue.Client.batch_delete_connection(**kwargs)

   Deletes a list of connection definitions from the Data Catalog.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.batch_delete_connection(
          CatalogId='string',
          ConnectionNameList=[
              'string',
          ]
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog in
        which the connections reside. If none is provided, the Amazon
        Web Services account ID is used by default.

      * **ConnectionNameList** (*list*) --

        **[REQUIRED]**

        A list of names of the connections to delete.

        * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Succeeded': [
                 'string',
             ],
             'Errors': {
                 'string': {
                     'ErrorCode': 'string',
                     'ErrorMessage': 'string'
                 }
             }
         }

      **Response Structure**

      * *(dict) --*

        * **Succeeded** *(list) --*

          A list of names of the connection definitions that were
          successfully deleted.

          * *(string) --*

        * **Errors** *(dict) --*

          A map of the names of connections that were not successfully
          deleted to error details.

          * *(string) --*

            * *(dict) --*

              Contains details about an error.

              * **ErrorCode** *(string) --*

                The code associated with this error.

              * **ErrorMessage** *(string) --*

                A message describing the error.

   **Exceptions**

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
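
   Because the response separates "Succeeded" names from an "Errors"
   map, a batch call can partly succeed, so both fields should be
   inspected. The following is a minimal sketch; the helper name
   "delete_connections" is hypothetical and "client" is assumed to be
   a Glue client.

```python
def delete_connections(client, names, catalog_id=None):
    """Delete connections and split the outcome into successes and failures.

    Returns (succeeded, failed), where `failed` maps each connection
    name to a short "ErrorCode: ErrorMessage" string.
    """
    kwargs = {"ConnectionNameList": list(names)}
    if catalog_id is not None:
        kwargs["CatalogId"] = catalog_id

    response = client.batch_delete_connection(**kwargs)
    succeeded = list(response.get("Succeeded", []))
    failed = {
        name: f"{detail.get('ErrorCode')}: {detail.get('ErrorMessage')}"
        for name, detail in response.get("Errors", {}).items()
    }
    return succeeded, failed
```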


get_workflow_runs
*****************

Glue.Client.get_workflow_runs(**kwargs)

   Retrieves metadata for all runs of a given workflow.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_workflow_runs(
          Name='string',
          IncludeGraph=True|False,
          NextToken='string',
          MaxResults=123
      )

   Parameters:
      * **Name** (*string*) --

        **[REQUIRED]**

        Name of the workflow whose run metadata should be returned.

      * **IncludeGraph** (*boolean*) -- Specifies whether to include
        the workflow graph in the response.

      * **NextToken** (*string*) -- A continuation token, if this is
        a continuation request.

      * **MaxResults** (*integer*) -- The maximum number of workflow
        runs to be included in the response.
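
   The "NextToken" request parameter pairs with the "NextToken" field
   in the response, so all runs can be read page by page. The
   following is a minimal sketch; the generator name
   "iter_workflow_runs" is hypothetical, "client" is assumed to be a
   Glue client, and a missing "NextToken" is treated as the last page.

```python
def iter_workflow_runs(client, name, include_graph=False, max_results=None):
    """Yield workflow run metadata dicts, following NextToken between pages."""
    kwargs = {"Name": name, "IncludeGraph": include_graph}
    if max_results is not None:
        kwargs["MaxResults"] = max_results

    while True:
        response = client.get_workflow_runs(**kwargs)
        yield from response.get("Runs", [])
        token = response.get("NextToken")
        if not token:
            return
        # Continue from where the previous response stopped.
        kwargs["NextToken"] = token
```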

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Runs': [
                 {
                     'Name': 'string',
                     'WorkflowRunId': 'string',
                     'PreviousRunId': 'string',
                     'WorkflowRunProperties': {
                         'string': 'string'
                     },
                     'StartedOn': datetime(2015, 1, 1),
                     'CompletedOn': datetime(2015, 1, 1),
                     'Status': 'RUNNING'|'COMPLETED'|'STOPPING'|'STOPPED'|'ERROR',
                     'ErrorMessage': 'string',
                     'Statistics': {
                         'TotalActions': 123,
                         'TimeoutActions': 123,
                         'FailedActions': 123,
                         'StoppedActions': 123,
                         'SucceededActions': 123,
                         'RunningActions': 123,
                         'ErroredActions': 123,
                         'WaitingActions': 123
                     },
                     'Graph': {
                         'Nodes': [
                             {
                                 'Type': 'CRAWLER'|'JOB'|'TRIGGER',
                                 'Name': 'string',
                                 'UniqueId': 'string',
                                 'TriggerDetails': {
                                     'Trigger': {
                                         'Name': 'string',
                                         'WorkflowName': 'string',
                                         'Id': 'string',
                                         'Type': 'SCHEDULED'|'CONDITIONAL'|'ON_DEMAND'|'EVENT',
                                         'State': 'CREATING'|'CREATED'|'ACTIVATING'|'ACTIVATED'|'DEACTIVATING'|'DEACTIVATED'|'DELETING'|'UPDATING',
                                         'Description': 'string',
                                         'Schedule': 'string',
                                         'Actions': [
                                             {
                                                 'JobName': 'string',
                                                 'Arguments': {
                                                     'string': 'string'
                                                 },
                                                 'Timeout': 123,
                                                 'SecurityConfiguration': 'string',
                                                 'NotificationProperty': {
                                                     'NotifyDelayAfter': 123
                                                 },
                                                 'CrawlerName': 'string'
                                             },
                                         ],
                                         'Predicate': {
                                             'Logical': 'AND'|'ANY',
                                             'Conditions': [
                                                 {
                                                     'LogicalOperator': 'EQUALS',
                                                     'JobName': 'string',
                                                     'State': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT'|'ERROR'|'WAITING'|'EXPIRED',
                                                     'CrawlerName': 'string',
                                                     'CrawlState': 'RUNNING'|'CANCELLING'|'CANCELLED'|'SUCCEEDED'|'FAILED'|'ERROR'
                                                 },
                                             ]
                                         },
                                         'EventBatchingCondition': {
                                             'BatchSize': 123,
                                             'BatchWindow': 123
                                         }
                                     }
                                 },
                                 'JobDetails': {
                                     'JobRuns': [
                                         {
                                             'Id': 'string',
                                             'Attempt': 123,
                                             'PreviousRunId': 'string',
                                             'TriggerName': 'string',
                                             'JobName': 'string',
                                             'JobMode': 'SCRIPT'|'VISUAL'|'NOTEBOOK',
                                             'JobRunQueuingEnabled': True|False,
                                             'StartedOn': datetime(2015, 1, 1),
                                             'LastModifiedOn': datetime(2015, 1, 1),
                                             'CompletedOn': datetime(2015, 1, 1),
                                             'JobRunState': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT'|'ERROR'|'WAITING'|'EXPIRED',
                                             'Arguments': {
                                                 'string': 'string'
                                             },
                                             'ErrorMessage': 'string',
                                             'PredecessorRuns': [
                                                 {
                                                     'JobName': 'string',
                                                     'RunId': 'string'
                                                 },
                                             ],
                                             'AllocatedCapacity': 123,
                                             'ExecutionTime': 123,
                                             'Timeout': 123,
                                             'MaxCapacity': 123.0,
                                             'WorkerType': 'Standard'|'G.1X'|'G.2X'|'G.025X'|'G.4X'|'G.8X'|'Z.2X',
                                             'NumberOfWorkers': 123,
                                             'SecurityConfiguration': 'string',
                                             'LogGroupName': 'string',
                                             'NotificationProperty': {
                                                 'NotifyDelayAfter': 123
                                             },
                                             'GlueVersion': 'string',
                                             'DPUSeconds': 123.0,
                                             'ExecutionClass': 'FLEX'|'STANDARD',
                                             'MaintenanceWindow': 'string',
                                             'ProfileName': 'string',
                                             'StateDetail': 'string',
                                             'ExecutionRoleSessionPolicy': 'string'
                                         },
                                     ]
                                 },
                                 'CrawlerDetails': {
                                     'Crawls': [
                                         {
                                             'State': 'RUNNING'|'CANCELLING'|'CANCELLED'|'SUCCEEDED'|'FAILED'|'ERROR',
                                             'StartedOn': datetime(2015, 1, 1),
                                             'CompletedOn': datetime(2015, 1, 1),
                                             'ErrorMessage': 'string',
                                             'LogGroup': 'string',
                                             'LogStream': 'string'
                                         },
                                     ]
                                 }
                             },
                         ],
                         'Edges': [
                             {
                                 'SourceId': 'string',
                                 'DestinationId': 'string'
                             },
                         ]
                     },
                     'StartingEventBatchCondition': {
                         'BatchSize': 123,
                         'BatchWindow': 123
                     }
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Runs** *(list) --*

          A list of workflow run metadata objects.

          * *(dict) --*

            A workflow run is an execution of a workflow providing all
            the runtime information.

            * **Name** *(string) --*

              Name of the workflow that was run.

            * **WorkflowRunId** *(string) --*

              The ID of this workflow run.

            * **PreviousRunId** *(string) --*

              The ID of the previous workflow run.

            * **WorkflowRunProperties** *(dict) --*

              The workflow run properties which were set during the
              run.

              * *(string) --*

                * *(string) --*

            * **StartedOn** *(datetime) --*

              The date and time when the workflow run was started.

            * **CompletedOn** *(datetime) --*

              The date and time when the workflow run completed.

            * **Status** *(string) --*

              The status of the workflow run.

            * **ErrorMessage** *(string) --*

              This error message describes any error that may have
              occurred in starting the workflow run. Currently the
              only error message is "Concurrent runs exceeded for
              workflow: "foo"."

            * **Statistics** *(dict) --*

              The statistics of the run.

              * **TotalActions** *(integer) --*

                Total number of Actions in the workflow run.

              * **TimeoutActions** *(integer) --*

                Total number of Actions that timed out.

              * **FailedActions** *(integer) --*

                Total number of Actions that have failed.

              * **StoppedActions** *(integer) --*

                Total number of Actions that have stopped.

              * **SucceededActions** *(integer) --*

                Total number of Actions that have succeeded.

              * **RunningActions** *(integer) --*

                Total number of Actions in the running state.

              * **ErroredActions** *(integer) --*

                Indicates the count of job runs in the ERROR state in
                the workflow run.

              * **WaitingActions** *(integer) --*

                Indicates the count of job runs in WAITING state in
                the workflow run.

            * **Graph** *(dict) --*

              The graph representing all the Glue components that
              belong to the workflow as nodes and directed connections
              between them as edges.

              * **Nodes** *(list) --*

                A list of the Glue components that belong to the
                workflow, represented as nodes.

                * *(dict) --*

                  A node represents a Glue component (trigger,
                  crawler, or job) on a workflow graph.

                  * **Type** *(string) --*

                    The type of Glue component represented by the
                    node.

                  * **Name** *(string) --*

                    The name of the Glue component represented by the
                    node.

                  * **UniqueId** *(string) --*

                    The unique Id assigned to the node within the
                    workflow.

                  * **TriggerDetails** *(dict) --*

                    Details of the Trigger when the node represents a
                    Trigger.

                    * **Trigger** *(dict) --*

                      The information of the trigger represented by
                      the trigger node.

                      * **Name** *(string) --*

                        The name of the trigger.

                      * **WorkflowName** *(string) --*

                        The name of the workflow associated with the
                        trigger.

                      * **Id** *(string) --*

                        Reserved for future use.

                      * **Type** *(string) --*

                        The type of trigger that this is.

                      * **State** *(string) --*

                        The current state of the trigger.

                      * **Description** *(string) --*

                        A description of this trigger.

                      * **Schedule** *(string) --*

                        A "cron" expression used to specify the
                        schedule (see Time-Based Schedules for Jobs
                        and Crawlers). For example, to run something
                        every day at 12:15 UTC, you would specify:
                        "cron(15 12 * * ? *)".

                      * **Actions** *(list) --*

                        The actions initiated by this trigger.

                        * *(dict) --*

                          Defines an action to be initiated by a
                          trigger.

                          * **JobName** *(string) --*

                            The name of a job to be run.

                          * **Arguments** *(dict) --*

                            The job arguments used when this trigger
                            fires. For this job run, they replace the
                            default arguments set in the job
                            definition itself.

                            You can specify arguments here that your
                            own job-execution script consumes, as well
                            as arguments that Glue itself consumes.

                            For information about how to specify and
                            consume your own Job arguments, see the
                            Calling Glue APIs in Python topic in the
                            developer guide.

                            For information about the key-value pairs
                            that Glue consumes to set up your job, see
                            the Special Parameters Used by Glue topic
                            in the developer guide.

                            * *(string) --*

                              * *(string) --*

                          * **Timeout** *(integer) --*

                            The "JobRun" timeout in minutes. This is
                            the maximum time that a job run can
                            consume resources before it is terminated
                            and enters "TIMEOUT" status. This
                            overrides the timeout value set in the
                            parent job.

                            Jobs must have timeout values less than 7
                            days or 10080 minutes. Otherwise, the jobs
                            will throw an exception.

                            When the value is left blank, the timeout
                            is defaulted to 2880 minutes.

                            Any existing Glue jobs that had a timeout
                            value greater than 7 days will be
                            defaulted to 7 days. For instance, if you
                            have specified a timeout of 20 days for a
                            batch job, it will be stopped on the 7th
                            day.

                            For streaming jobs, if you have set up a
                            maintenance window, it will be restarted
                            during the maintenance window after 7
                            days.

                          * **SecurityConfiguration** *(string) --*

                            The name of the "SecurityConfiguration"
                            structure to be used with this action.

                          * **NotificationProperty** *(dict) --*

                            Specifies configuration properties of a
                            job run notification.

                            * **NotifyDelayAfter** *(integer) --*

                              After a job run starts, the number of
                              minutes to wait before sending a job run
                              delay notification.

                          * **CrawlerName** *(string) --*

                            The name of the crawler to be used with
                            this action.

                      * **Predicate** *(dict) --*

                        The predicate of this trigger, which defines
                        when it will fire.

                        * **Logical** *(string) --*

                          An optional field if only one condition is
                          listed. If multiple conditions are listed,
                          then this field is required.

                        * **Conditions** *(list) --*

                          A list of the conditions that determine when
                          the trigger will fire.

                          * *(dict) --*

                            Defines a condition under which a trigger
                            fires.

                            * **LogicalOperator** *(string) --*

                              A logical operator.

                            * **JobName** *(string) --*

                              The name of the job whose "JobRuns" this
                              condition applies to, and on which this
                              trigger waits.

                            * **State** *(string) --*

                              The condition state. Currently, the only
                              job states that a trigger can listen for
                              are "SUCCEEDED", "STOPPED", "FAILED",
                              and "TIMEOUT". The only crawler states
                              that a trigger can listen for are
                              "SUCCEEDED", "FAILED", and "CANCELLED".

                            * **CrawlerName** *(string) --*

                              The name of the crawler to which this
                              condition applies.

                            * **CrawlState** *(string) --*

                              The state of the crawler to which this
                              condition applies.

                      * **EventBatchingCondition** *(dict) --*

                        Batch condition that must be met (specified
                        number of events received or batch time window
                        expired) before EventBridge event trigger
                        fires.

                        * **BatchSize** *(integer) --*

                          Number of events that must be received from
                          Amazon EventBridge before EventBridge event
                          trigger fires.

                        * **BatchWindow** *(integer) --*

                          Window of time in seconds after which
                          EventBridge event trigger fires. Window
                          starts when first event is received.

                  * **JobDetails** *(dict) --*

                    Details of the Job when the node represents a Job.

                    * **JobRuns** *(list) --*

                      The information for the job runs represented by
                      the job node.

                      * *(dict) --*

                        Contains information about a job run.

                        * **Id** *(string) --*

                          The ID of this job run.

                        * **Attempt** *(integer) --*

                          The number of the attempt to run this job.

                        * **PreviousRunId** *(string) --*

                          The ID of the previous run of this job. For
                          example, the "JobRunId" specified in the
                          "StartJobRun" action.

                        * **TriggerName** *(string) --*

                          The name of the trigger that started this
                          job run.

                        * **JobName** *(string) --*

                          The name of the job definition being used in
                          this run.

                        * **JobMode** *(string) --*

                          A mode that describes how a job was created.
                          Valid values are:

                          * "SCRIPT" - The job was created using the
                            Glue Studio script editor.

                          * "VISUAL" - The job was created using the
                            Glue Studio visual editor.

                          * "NOTEBOOK" - The job was created using an
                            interactive sessions notebook.

                          When the "JobMode" field is missing or null,
                          "SCRIPT" is assigned as the default value.

                        * **JobRunQueuingEnabled** *(boolean) --*

                          Specifies whether job run queuing is enabled
                          for the job run.

                          A value of true means job run queuing is
                          enabled for the job run. If false or not
                          populated, the job run will not be
                          considered for queueing.

                        * **StartedOn** *(datetime) --*

                          The date and time at which this job run was
                          started.

                        * **LastModifiedOn** *(datetime) --*

                          The last time that this job run was
                          modified.

                        * **CompletedOn** *(datetime) --*

                          The date and time that this job run
                          completed.

                        * **JobRunState** *(string) --*

                          The current state of the job run. For more
                          information about the statuses of jobs that
                          have terminated abnormally, see Glue Job Run
                          Statuses.

                        * **Arguments** *(dict) --*

                          The job arguments associated with this run.
                          For this job run, they replace the default
                          arguments set in the job definition itself.

                          You can specify arguments here that your own
                          job-execution script consumes, as well as
                          arguments that Glue itself consumes.

                          Job arguments may be logged. Do not pass
                          plaintext secrets as arguments. Retrieve
                          secrets from a Glue Connection, Secrets
                          Manager or other secret management mechanism
                          if you intend to keep them within the Job.

                          For information about how to specify and
                          consume your own Job arguments, see the
                          Calling Glue APIs in Python topic in the
                          developer guide.

                          For information about the arguments you can
                          provide to this field when configuring Spark
                          jobs, see the Special Parameters Used by
                          Glue topic in the developer guide.

                          For information about the arguments you can
                          provide to this field when configuring Ray
                          jobs, see Using job parameters in Ray jobs
                          in the developer guide.

                          * *(string) --*

                            * *(string) --*

                        * **ErrorMessage** *(string) --*

                          An error message associated with this job
                          run.

                        * **PredecessorRuns** *(list) --*

                          A list of predecessors to this job run.

                          * *(dict) --*

                            A job run that was used in the predicate
                            of a conditional trigger that triggered
                            this job run.

                            * **JobName** *(string) --*

                              The name of the job definition used by
                              the predecessor job run.

                            * **RunId** *(string) --*

                              The job-run ID of the predecessor job
                              run.

                        * **AllocatedCapacity** *(integer) --*

                          This field is deprecated. Use "MaxCapacity"
                          instead.

                          The number of Glue data processing units
                          (DPUs) allocated to this JobRun. From 2 to
                          100 DPUs can be allocated; the default is
                          10. A DPU is a relative measure of
                          processing power that consists of 4 vCPUs of
                          compute capacity and 16 GB of memory. For
                          more information, see the Glue pricing page.

                        * **ExecutionTime** *(integer) --*

                          The amount of time (in seconds) that the job
                          run consumed resources.

                        * **Timeout** *(integer) --*

                          The "JobRun" timeout in minutes. This is the
                          maximum time that a job run can consume
                          resources before it is terminated and enters
                          "TIMEOUT" status. This value overrides the
                          timeout value set in the parent job.

                          Jobs must have timeout values less than 7
                          days or 10080 minutes. Otherwise, the jobs
                          will throw an exception.

                          When the value is left blank, the timeout is
                          defaulted to 2880 minutes.

                          Any existing Glue jobs that had a timeout
                          value greater than 7 days will be defaulted
                          to 7 days. For instance, if you have
                          specified a timeout of 20 days for a batch
                          job, it will be stopped on the 7th day.

                          For streaming jobs, if you have set up a
                          maintenance window, it will be restarted
                          during the maintenance window after 7 days.

                        * **MaxCapacity** *(float) --*

                          For Glue version 1.0 or earlier jobs, using
                          the standard worker type, the number of Glue
                          data processing units (DPUs) that can be
                          allocated when this job runs. A DPU is a
                          relative measure of processing power that
                          consists of 4 vCPUs of compute capacity and
                          16 GB of memory. For more information, see
                          the Glue pricing page.

                          For Glue version 2.0+ jobs, you cannot
                          specify a "Maximum capacity". Instead, you
                          should specify a "Worker type" and the
                          "Number of workers".

                          Do not set "MaxCapacity" if using
                          "WorkerType" and "NumberOfWorkers".

                          The value that can be allocated for
                          "MaxCapacity" depends on whether you are
                          running a Python shell job, an Apache Spark
                          ETL job, or an Apache Spark streaming ETL
                          job:

                          * When you specify a Python shell job
                            ("JobCommand.Name"="pythonshell"), you can
                            allocate either 0.0625 or 1 DPU. The
                            default is 0.0625 DPU.

                          * When you specify an Apache Spark ETL job
                            ("JobCommand.Name"="glueetl") or Apache
                            Spark streaming ETL job
                            ("JobCommand.Name"="gluestreaming"), you
                            can allocate from 2 to 100 DPUs. The
                            default is 10 DPUs. This job type cannot
                            have a fractional DPU allocation.

                        * **WorkerType** *(string) --*

                          The type of predefined worker that is
                          allocated when a job runs. Accepts a value
                          of G.1X, G.2X, G.4X, G.8X or G.025X for
                          Spark jobs. Accepts the value Z.2X for Ray
                          jobs.

                          * For the "G.1X" worker type, each worker
                            maps to 1 DPU (4 vCPUs, 16 GB of memory)
                            with 94GB disk, and provides 1 executor
                            per worker. We recommend this worker type
                            for workloads such as data transforms,
                            joins, and queries, as it offers a scalable
                            and cost-effective way to run most jobs.

                          * For the "G.2X" worker type, each worker
                            maps to 2 DPU (8 vCPUs, 32 GB of memory)
                            with 138GB disk, and provides 1 executor
                            per worker. We recommend this worker type
                            for workloads such as data transforms,
                            joins, and queries, as it offers a scalable
                            and cost-effective way to run most jobs.

                          * For the "G.4X" worker type, each worker
                            maps to 4 DPU (16 vCPUs, 64 GB of memory)
                            with 256GB disk, and provides 1 executor
                            per worker. We recommend this worker type
                            for jobs whose workloads contain your most
                            demanding transforms, aggregations, joins,
                            and queries. This worker type is available
                            only for Glue version 3.0 or later Spark
                            ETL jobs in the following Amazon Web
                            Services Regions: US East (Ohio), US East
                            (N. Virginia), US West (Oregon), Asia
                            Pacific (Singapore), Asia Pacific
                            (Sydney), Asia Pacific (Tokyo), Canada
                            (Central), Europe (Frankfurt), Europe
                            (Ireland), and Europe (Stockholm).

                          * For the "G.8X" worker type, each worker
                            maps to 8 DPU (32 vCPUs, 128 GB of memory)
                            with 512GB disk, and provides 1 executor
                            per worker. We recommend this worker type
                            for jobs whose workloads contain your most
                            demanding transforms, aggregations, joins,
                            and queries. This worker type is available
                            only for Glue version 3.0 or later Spark
                            ETL jobs, in the same Amazon Web Services
                            Regions as supported for the "G.4X" worker
                            type.

                          * For the "G.025X" worker type, each worker
                            maps to 0.25 DPU (2 vCPUs, 4 GB of memory)
                            with 84GB disk, and provides 1 executor
                            per worker. We recommend this worker type
                            for low volume streaming jobs. This worker
                            type is only available for Glue version
                            3.0 or later streaming jobs.

                          * For the "Z.2X" worker type, each worker
                            maps to 2 M-DPU (8vCPUs, 64 GB of memory)
                            with 128 GB disk, and provides up to 8 Ray
                            workers based on the autoscaler.

                        * **NumberOfWorkers** *(integer) --*

                          The number of workers of a defined
                          "workerType" that are allocated when a job
                          runs.

                        * **SecurityConfiguration** *(string) --*

                          The name of the "SecurityConfiguration"
                          structure to be used with this job run.

                        * **LogGroupName** *(string) --*

                          The name of the log group for secure logging
                          that can be server-side encrypted in Amazon
                          CloudWatch using KMS. This name can be
                          "/aws-glue/jobs/", in which case the default
                          encryption is "NONE". If you add a role name
                          and "SecurityConfiguration" name (in other
                          words, "/aws-glue/jobs-yourRoleName-
                          yourSecurityConfigurationName/"), then that
                          security configuration is used to encrypt
                          the log group.

                        * **NotificationProperty** *(dict) --*

                          Specifies configuration properties of a job
                          run notification.

                          * **NotifyDelayAfter** *(integer) --*

                            After a job run starts, the number of
                            minutes to wait before sending a job run
                            delay notification.

                        * **GlueVersion** *(string) --*

                          In Spark jobs, "GlueVersion" determines the
                          versions of Apache Spark and Python that
                          Glue makes available in a job. The Python
                          version indicates the version supported for
                          jobs of type Spark.

                          Ray jobs should set "GlueVersion" to "4.0"
                          or greater. However, the versions of Ray,
                          Python and additional libraries available in
                          your Ray job are determined by the "Runtime"
                          parameter of the Job command.

                          For more information about the available
                          Glue versions and corresponding Spark and
                          Python versions, see Glue version in the
                          developer guide.

                          Jobs that are created without specifying a
                          Glue version default to Glue 0.9.

                        * **DPUSeconds** *(float) --*

                          This field can be set for either job runs
                          with execution class "FLEX" or when Auto
                          Scaling is enabled. It represents the total
                          time each executor ran during the lifecycle
                          of a job run, in seconds, multiplied by a
                          DPU factor (1 for "G.1X", 2 for "G.2X", or
                          0.25 for "G.025X" workers). With Auto
                          Scaling, the number of executors running at
                          a given time may be less than "MaxCapacity",
                          so "DPUSeconds" can be less than
                          "executionEngineRuntime" * "MaxCapacity".

                        * **ExecutionClass** *(string) --*

                          Indicates whether the job is run with a
                          standard or flexible execution class. The
                          standard execution class is ideal for
                          time-sensitive workloads that require fast
                          job startup and dedicated resources.

                          The flexible execution class is appropriate
                          for time-insensitive jobs whose start and
                          completion times may vary.

                          Only jobs with Glue version 3.0 and above
                          and command type "glueetl" will be allowed
                          to set "ExecutionClass" to "FLEX". The
                          flexible execution class is available for
                          Spark jobs.

                        * **MaintenanceWindow** *(string) --*

                          This field specifies a day of the week and
                          hour for a maintenance window for streaming
                          jobs. Glue periodically performs maintenance
                          activities. During these maintenance
                          windows, Glue will need to restart your
                          streaming jobs.

                          Glue will restart the job within 3 hours of
                          the specified maintenance window. For
                          instance, if you set up the maintenance
                          window for Monday at 10:00AM GMT, your jobs
                          will be restarted between 10:00AM GMT and
                          1:00PM GMT.

                        * **ProfileName** *(string) --*

                          The name of a Glue usage profile associated
                          with the job run.

                        * **StateDetail** *(string) --*

                          This field holds details that pertain to the
                          state of a job run. The field is nullable.

                          For example, when a job run is in a WAITING
                          state as a result of job run queuing, the
                          field has the reason why the job run is in
                          that state.

                        * **ExecutionRoleSessionPolicy** *(string) --*

                          An inline session policy passed to the
                          StartJobRun API. It lets you dynamically
                          restrict the permissions of the specified
                          execution role for the scope of the job,
                          without requiring the creation of additional
                          IAM roles.

                  * **CrawlerDetails** *(dict) --*

                    Details of the crawler when the node represents a
                    crawler.

                    * **Crawls** *(list) --*

                      A list of crawls represented by the crawl node.

                      * *(dict) --*

                        The details of a crawl in the workflow.

                        * **State** *(string) --*

                          The state of the crawler.

                        * **StartedOn** *(datetime) --*

                          The date and time on which the crawl
                          started.

                        * **CompletedOn** *(datetime) --*

                          The date and time on which the crawl
                          completed.

                        * **ErrorMessage** *(string) --*

                          The error message associated with the crawl.

                        * **LogGroup** *(string) --*

                          The log group associated with the crawl.

                        * **LogStream** *(string) --*

                          The log stream associated with the crawl.

              * **Edges** *(list) --*

                A list of all the directed connections between the
                nodes belonging to the workflow.

                * *(dict) --*

                  An edge represents a directed connection between two
                  Glue components that are part of the workflow the
                  edge belongs to.

                  * **SourceId** *(string) --*

                    The unique ID of the node within the workflow
                    where the edge starts.

                  * **DestinationId** *(string) --*

                    The unique ID of the node within the workflow
                    where the edge ends.

            * **StartingEventBatchCondition** *(dict) --*

              The batch condition that started the workflow run.

              * **BatchSize** *(integer) --*

                Number of events in the batch.

              * **BatchWindow** *(integer) --*

                Duration of the batch window in seconds.

        * **NextToken** *(string) --*

          A continuation token, if not all requested workflow runs
          have been returned.
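
The worker-type DPU figures and the "DPUSeconds" note above can be combined into a rough capacity check. The sketch below is illustrative only: it assumes "ExecutionTime" can stand in for "executionEngineRuntime", and the helper names and the sample job run are hypothetical, not part of the Glue API.

```python
# DPU factor per worker type, copied from the WorkerType description.
# Z.2X is measured in M-DPUs, so it is left out of this DPU table.
DPU_PER_WORKER = {
    "G.025X": 0.25,
    "G.1X": 1.0,
    "G.2X": 2.0,
    "G.4X": 4.0,
    "G.8X": 8.0,
}


def provisioned_dpus(job_run):
    """Return the DPUs provisioned for a job run dict, or None if unknown."""
    worker_type = job_run.get("WorkerType")
    workers = job_run.get("NumberOfWorkers")
    if worker_type in DPU_PER_WORKER and workers is not None:
        return DPU_PER_WORKER[worker_type] * workers
    # Fall back to MaxCapacity (Glue 1.0 or earlier, or Python shell jobs).
    return job_run.get("MaxCapacity")


def dpu_seconds_ceiling(job_run):
    """Upper bound on DPUSeconds: run time in seconds times provisioned DPUs.

    Assumption: ExecutionTime is used in place of executionEngineRuntime.
    """
    dpus = provisioned_dpus(job_run)
    seconds = job_run.get("ExecutionTime")
    if dpus is None or seconds is None:
        return None
    return dpus * seconds


# Hypothetical job run entry, shaped like one item of JobDetails.JobRuns.
sample_run = {
    "WorkerType": "G.2X",
    "NumberOfWorkers": 10,
    "ExecutionTime": 600,
    "DPUSeconds": 9000.0,
}
ceiling = dpu_seconds_ceiling(sample_run)
print(ceiling, sample_run["DPUSeconds"] <= ceiling)
```

With Auto Scaling or "FLEX" runs, a reported "DPUSeconds" below this ceiling is expected rather than a sign of a problem.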

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
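
Because each edge carries only a "SourceId" and a "DestinationId", a common first step when inspecting a workflow run is to turn the "Edges" list into an adjacency map. The following is a minimal sketch under that assumption; the function names and the sample node IDs are hypothetical, and the input is any dict that holds an "Edges" list in the shape described above.

```python
from collections import defaultdict, deque


def build_adjacency(graph):
    """Map each SourceId to the list of DestinationIds it points at."""
    adjacency = defaultdict(list)
    for edge in graph.get("Edges", []):
        adjacency[edge["SourceId"]].append(edge["DestinationId"])
    return dict(adjacency)


def downstream_of(graph, start_id):
    """Return node IDs reachable from start_id, in breadth-first order."""
    adjacency = build_adjacency(graph)
    seen, order = {start_id}, []
    queue = deque([start_id])
    while queue:
        current = queue.popleft()
        for nxt in adjacency.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


# Hypothetical graph; the node IDs are made up for illustration.
sample_graph = {
    "Edges": [
        {"SourceId": "trigger-1", "DestinationId": "job-a"},
        {"SourceId": "job-a", "DestinationId": "trigger-2"},
        {"SourceId": "trigger-2", "DestinationId": "crawler-x"},
    ]
}
print(downstream_of(sample_graph, "trigger-1"))
```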


create_job
**********

Glue.Client.create_job(**kwargs)

   Creates a new job definition.

   See also: AWS API Documentation

      **Request Syntax**

         # This section is too large to render.
         # Please see the AWS API Documentation linked below.

      AWS API Documentation

      **Parameters**

         # This section is too large to render.
         # Please see the AWS API Documentation linked below.

      AWS API Documentation

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Name': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Name** *(string) --*

          The unique name that was provided for this job definition.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.IdempotentParameterMismatchException"

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.ConcurrentModificationException"
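
Since the request syntax is not rendered here, the sketch below shows one plausible way to assemble a small set of "create_job" arguments for a Spark ETL job. It is not a complete parameter list. The helper names, default values, role ARN, and script path are hypothetical. The timeout check follows the "less than 10080 minutes" wording used for job run timeouts elsewhere in this reference.

```python
MAX_TIMEOUT_MINUTES = 10080  # 7 days, the limit quoted for job timeouts


def build_create_job_kwargs(name, role, script_location, *,
                            glue_version="4.0", worker_type="G.1X",
                            number_of_workers=2, timeout=None):
    """Assemble keyword arguments for client.create_job (Spark ETL job)."""
    kwargs = {
        "Name": name,
        "Role": role,
        "Command": {
            "Name": "glueetl",
            "ScriptLocation": script_location,
            "PythonVersion": "3",
        },
        "GlueVersion": glue_version,
        # WorkerType and NumberOfWorkers are used instead of MaxCapacity.
        "WorkerType": worker_type,
        "NumberOfWorkers": number_of_workers,
    }
    if timeout is not None:
        if not 0 < timeout < MAX_TIMEOUT_MINUTES:
            raise ValueError("Timeout must be between 1 and 10079 minutes")
        kwargs["Timeout"] = timeout
    return kwargs


def create_job(client, **builder_args):
    """Call create_job on a Glue client and return the new job's name."""
    response = client.create_job(**build_create_job_kwargs(**builder_args))
    return response["Name"]


# Hypothetical values; replace with a real role ARN and script path.
example = build_create_job_kwargs(
    name="example-etl",
    role="arn:aws:iam::123456789012:role/ExampleGlueRole",
    script_location="s3://example-bucket/scripts/etl.py",
    timeout=120,
)
print(sorted(example))
```

With a real client, the call would look like "create_job(boto3.client('glue'), name=..., role=..., script_location=...)". Setting "GlueVersion" explicitly avoids relying on the service default.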


get_data_quality_model_result
*****************************

Glue.Client.get_data_quality_model_result(**kwargs)

   Retrieves a statistic's predictions for a given Profile ID.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_data_quality_model_result(
          StatisticId='string',
          ProfileId='string'
      )

   Parameters:
      * **StatisticId** (*string*) --

        **[REQUIRED]**

        The Statistic ID.

      * **ProfileId** (*string*) --

        **[REQUIRED]**

        The Profile ID.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'CompletedOn': datetime(2015, 1, 1),
             'Model': [
                 {
                     'LowerBound': 123.0,
                     'UpperBound': 123.0,
                     'PredictedValue': 123.0,
                     'ActualValue': 123.0,
                     'Date': datetime(2015, 1, 1),
                     'InclusionAnnotation': 'INCLUDE'|'EXCLUDE'
                 },
             ]
         }

      **Response Structure**

      * *(dict) --*

        * **CompletedOn** *(datetime) --*

          The timestamp when the data quality model training
          completed.

        * **Model** *(list) --*

          A list of "StatisticModelResult" objects.

          * *(dict) --*

            The statistic model result.

            * **LowerBound** *(float) --*

              The lower bound.

            * **UpperBound** *(float) --*

              The upper bound.

            * **PredictedValue** *(float) --*

              The predicted value.

            * **ActualValue** *(float) --*

              The actual value.

            * **Date** *(datetime) --*

              The date.

            * **InclusionAnnotation** *(string) --*

              The inclusion annotation.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"
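
One possible use of the "Model" list is to pick out the points whose "ActualValue" falls outside the predicted band. The sketch below assumes the bounds are inclusive, which this reference does not state. The function name and the sample response are hypothetical.

```python
from datetime import datetime


def out_of_bounds_points(response):
    """Return Model entries whose ActualValue lies outside the band.

    Assumption: LowerBound and UpperBound are treated as inclusive.
    """
    flagged = []
    for point in response.get("Model", []):
        actual = point.get("ActualValue")
        lower = point.get("LowerBound")
        upper = point.get("UpperBound")
        if actual is None or lower is None or upper is None:
            continue  # skip incomplete entries rather than guess
        if not lower <= actual <= upper:
            flagged.append(point)
    return flagged


# Hypothetical response, shaped like the Response Syntax above.
sample_response = {
    "CompletedOn": datetime(2015, 1, 1),
    "Model": [
        {"LowerBound": 90.0, "UpperBound": 110.0, "PredictedValue": 100.0,
         "ActualValue": 104.0, "Date": datetime(2015, 1, 1),
         "InclusionAnnotation": "INCLUDE"},
        {"LowerBound": 90.0, "UpperBound": 110.0, "PredictedValue": 100.0,
         "ActualValue": 135.0, "Date": datetime(2015, 1, 2),
         "InclusionAnnotation": "INCLUDE"},
    ],
}
for point in out_of_bounds_points(sample_response):
    print(point["Date"].date(), point["ActualValue"])
```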


delete_security_configuration
*****************************

Glue.Client.delete_security_configuration(**kwargs)

   Deletes a specified security configuration.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_security_configuration(
          Name='string'
      )

   Parameters:
      **Name** (*string*) --

      **[REQUIRED]**

      The name of the security configuration to delete.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
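
Cleanup scripts often want deletion to be safe to repeat. A minimal sketch of that pattern, assuming the caller treats "EntityNotFoundException" as "already deleted", could look like this. The wrapper name is hypothetical. "client" is expected to be a Glue client such as "boto3.client('glue')".

```python
def delete_security_configuration_if_exists(client, name):
    """Delete a security configuration; return False if it was not found."""
    try:
        client.delete_security_configuration(Name=name)
    except client.exceptions.EntityNotFoundException:
        # Treated here as "nothing to do"; other errors still propagate.
        return False
    return True
```

Other listed exceptions, such as "InvalidInputException", are deliberately not caught so that genuine errors remain visible to the caller.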


get_column_statistics_task_runs
*******************************

Glue.Client.get_column_statistics_task_runs(**kwargs)

   Retrieves information about all runs associated with the specified
   table.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_column_statistics_task_runs(
          DatabaseName='string',
          TableName='string',
          MaxResults=123,
          NextToken='string'
      )

   Parameters:
      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the database where the table resides.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table.

      * **MaxResults** (*integer*) -- The maximum size of the
        response.

      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation call.
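
The "MaxResults" and "NextToken" parameters suggest the usual token-based paging loop. The generator below is a sketch of that loop. Its name and the default page size are arbitrary choices, and it relies on the "ColumnStatisticsTaskRuns" and "NextToken" response keys shown in the Response Syntax.

```python
def iter_column_statistics_task_runs(client, database_name, table_name,
                                     page_size=50):
    """Yield every task run for a table, following NextToken across pages."""
    kwargs = {
        "DatabaseName": database_name,
        "TableName": table_name,
        "MaxResults": page_size,  # arbitrary page size for illustration
    }
    while True:
        response = client.get_column_statistics_task_runs(**kwargs)
        yield from response.get("ColumnStatisticsTaskRuns", [])
        token = response.get("NextToken")
        if not token:
            break
        # Pass the token back to request the next page.
        kwargs["NextToken"] = token
```

A caller might write "for run in iter_column_statistics_task_runs(client, 'my_db', 'my_table'): ..." where the database and table names are placeholders.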

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'ColumnStatisticsTaskRuns': [
                 {
                     'CustomerId': 'string',
                     'ColumnStatisticsTaskRunId': 'string',
                     'DatabaseName': 'string',
                     'TableName': 'string',
                     'ColumnNameList': [
                         'string',
                     ],
                     'CatalogID': 'string',
                     'Role': 'string',
                     'SampleSize': 123.0,
                     'SecurityConfiguration': 'string',
                     'NumberOfWorkers': 123,
                     'WorkerType': 'string',
                     'ComputationType': 'FULL'|'INCREMENTAL',
                     'Status': 'STARTING'|'RUNNING'|'SUCCEEDED'|'FAILED'|'STOPPED',
                     'CreationTime': datetime(2015, 1, 1),
                     'LastUpdated': datetime(2015, 1, 1),
                     'StartTime': datetime(2015, 1, 1),
                     'EndTime': datetime(2015, 1, 1),
                     'ErrorMessage': 'string',
                     'DPUSeconds': 123.0
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **ColumnStatisticsTaskRuns** *(list) --*

          A list of column statistics task runs.

          * *(dict) --*

            The object that shows the details of the column stats run.

            * **CustomerId** *(string) --*

              The Amazon Web Services account ID.

            * **ColumnStatisticsTaskRunId** *(string) --*

              The identifier for the particular column statistics task
              run.

            * **DatabaseName** *(string) --*

              The database where the table resides.

            * **TableName** *(string) --*

              The name of the table for which column statistics are
              generated.

            * **ColumnNameList** *(list) --*

              A list of the column names. If none is supplied, all
              column names for the table will be used by default.

              * *(string) --*

            * **CatalogID** *(string) --*

              The ID of the Data Catalog where the table resides. If
              none is supplied, the Amazon Web Services account ID is
              used by default.

            * **Role** *(string) --*

              The IAM role that the service assumes to generate
              statistics.

            * **SampleSize** *(float) --*

              The percentage of rows used to generate statistics. If
              none is supplied, the entire table will be used to
              generate stats.

            * **SecurityConfiguration** *(string) --*

              Name of the security configuration that is used to
              encrypt CloudWatch logs for the column stats task run.

            * **NumberOfWorkers** *(integer) --*

              The number of workers used to generate column
              statistics. The job is preconfigured to autoscale up to
              25 instances.

            * **WorkerType** *(string) --*

              The type of workers being used for generating stats. The
              default is "g.1x".

            * **ComputationType** *(string) --*

              The type of column statistics computation.

            * **Status** *(string) --*

              The status of the task run.

            * **CreationTime** *(datetime) --*

              The time that this task was created.

            * **LastUpdated** *(datetime) --*

              The last point in time when this task was modified.

            * **StartTime** *(datetime) --*

              The start time of the task.

            * **EndTime** *(datetime) --*

              The end time of the task.

            * **ErrorMessage** *(string) --*

              The error message for the job.

            * **DPUSeconds** *(float) --*

              The calculated DPU usage in seconds for all autoscaled
              workers.

        * **NextToken** *(string) --*

          A continuation token, if not all task runs have yet been
          returned.

   **Exceptions**

   * "Glue.Client.exceptions.OperationTimeoutException"
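
   Because the response carries a "NextToken", retrieving every run
   for a table may take several calls. The following is a minimal
   sketch of that loop. The helper name
   "iter_column_statistics_task_runs" and the default page size of 50
   are assumptions made for illustration, and "client" is assumed to
   be a Glue client created with "boto3.client('glue')".

```python
def iter_column_statistics_task_runs(client, database_name, table_name,
                                     page_size=50):
    """Yield every column statistics task run for a table.

    Follows NextToken until the service stops returning one.
    `page_size` is an arbitrary illustrative default.
    """
    kwargs = {
        "DatabaseName": database_name,
        "TableName": table_name,
        "MaxResults": page_size,
    }
    while True:
        response = client.get_column_statistics_task_runs(**kwargs)
        yield from response.get("ColumnStatisticsTaskRuns", [])
        token = response.get("NextToken")
        if not token:
            break
        # Pass the token back unchanged to request the next page.
        kwargs["NextToken"] = token
```

   A caller could then filter on the documented "Status" values, for
   example by keeping only runs whose "Status" is "FAILED".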


get_job_runs
************

Glue.Client.get_job_runs(**kwargs)

   Retrieves metadata for all runs of a given job definition.

   "GetJobRuns" returns the job runs in reverse chronological order,
   with the newest job runs returned first.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_job_runs(
          JobName='string',
          NextToken='string',
          MaxResults=123
      )

   Parameters:
      * **JobName** (*string*) --

        **[REQUIRED]**

        The name of the job definition for which to retrieve all job
        runs.

      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation call.

      * **MaxResults** (*integer*) -- The maximum size of the
        response.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'JobRuns': [
                 {
                     'Id': 'string',
                     'Attempt': 123,
                     'PreviousRunId': 'string',
                     'TriggerName': 'string',
                     'JobName': 'string',
                     'JobMode': 'SCRIPT'|'VISUAL'|'NOTEBOOK',
                     'JobRunQueuingEnabled': True|False,
                     'StartedOn': datetime(2015, 1, 1),
                     'LastModifiedOn': datetime(2015, 1, 1),
                     'CompletedOn': datetime(2015, 1, 1),
                     'JobRunState': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT'|'ERROR'|'WAITING'|'EXPIRED',
                     'Arguments': {
                         'string': 'string'
                     },
                     'ErrorMessage': 'string',
                     'PredecessorRuns': [
                         {
                             'JobName': 'string',
                             'RunId': 'string'
                         },
                     ],
                     'AllocatedCapacity': 123,
                     'ExecutionTime': 123,
                     'Timeout': 123,
                     'MaxCapacity': 123.0,
                     'WorkerType': 'Standard'|'G.1X'|'G.2X'|'G.025X'|'G.4X'|'G.8X'|'Z.2X',
                     'NumberOfWorkers': 123,
                     'SecurityConfiguration': 'string',
                     'LogGroupName': 'string',
                     'NotificationProperty': {
                         'NotifyDelayAfter': 123
                     },
                     'GlueVersion': 'string',
                     'DPUSeconds': 123.0,
                     'ExecutionClass': 'FLEX'|'STANDARD',
                     'MaintenanceWindow': 'string',
                     'ProfileName': 'string',
                     'StateDetail': 'string',
                     'ExecutionRoleSessionPolicy': 'string'
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **JobRuns** *(list) --*

          A list of job-run metadata objects.

          * *(dict) --*

            Contains information about a job run.

            * **Id** *(string) --*

              The ID of this job run.

            * **Attempt** *(integer) --*

              The number of the attempt to run this job.

            * **PreviousRunId** *(string) --*

              The ID of the previous run of this job. For example, the
              "JobRunId" specified in the "StartJobRun" action.

            * **TriggerName** *(string) --*

              The name of the trigger that started this job run.

            * **JobName** *(string) --*

              The name of the job definition being used in this run.

            * **JobMode** *(string) --*

              A mode that describes how a job was created. Valid
              values are:

              * "SCRIPT" - The job was created using the Glue Studio
                script editor.

              * "VISUAL" - The job was created using the Glue Studio
                visual editor.

              * "NOTEBOOK" - The job was created using an interactive
                sessions notebook.

              When the "JobMode" field is missing or null, "SCRIPT" is
              assigned as the default value.

            * **JobRunQueuingEnabled** *(boolean) --*

              Specifies whether job run queuing is enabled for the job
              run.

              A value of true means job run queuing is enabled for the
              job run. If false or not populated, the job run will not
              be considered for queueing.

            * **StartedOn** *(datetime) --*

              The date and time at which this job run was started.

            * **LastModifiedOn** *(datetime) --*

              The last time that this job run was modified.

            * **CompletedOn** *(datetime) --*

              The date and time that this job run completed.

            * **JobRunState** *(string) --*

              The current state of the job run. For more information
              about the statuses of jobs that have terminated
              abnormally, see Glue Job Run Statuses.

            * **Arguments** *(dict) --*

              The job arguments associated with this run. For this job
              run, they replace the default arguments set in the job
              definition itself.

              You can specify arguments here that your own job-
              execution script consumes, as well as arguments that
              Glue itself consumes.

              Job arguments may be logged. Do not pass plaintext
              secrets as arguments. Retrieve secrets from a Glue
              Connection, Secrets Manager or other secret management
              mechanism if you intend to keep them within the Job.

              For information about how to specify and consume your
              own Job arguments, see the Calling Glue APIs in Python
              topic in the developer guide.

              For information about the arguments you can provide to
              this field when configuring Spark jobs, see the Special
              Parameters Used by Glue topic in the developer guide.

              For information about the arguments you can provide to
              this field when configuring Ray jobs, see Using job
              parameters in Ray jobs in the developer guide.

              * *(string) --*

                * *(string) --*

            * **ErrorMessage** *(string) --*

              An error message associated with this job run.

            * **PredecessorRuns** *(list) --*

              A list of predecessors to this job run.

              * *(dict) --*

                A job run that was used in the predicate of a
                conditional trigger that triggered this job run.

                * **JobName** *(string) --*

                  The name of the job definition used by the
                  predecessor job run.

                * **RunId** *(string) --*

                  The job-run ID of the predecessor job run.

            * **AllocatedCapacity** *(integer) --*

              This field is deprecated. Use "MaxCapacity" instead.

              The number of Glue data processing units (DPUs)
              allocated to this JobRun. From 2 to 100 DPUs can be
              allocated; the default is 10. A DPU is a relative
              measure of processing power that consists of 4 vCPUs of
              compute capacity and 16 GB of memory. For more
              information, see the Glue pricing page.

            * **ExecutionTime** *(integer) --*

              The amount of time (in seconds) that the job run
              consumed resources.

            * **Timeout** *(integer) --*

              The "JobRun" timeout in minutes. This is the maximum
              time that a job run can consume resources before it is
              terminated and enters "TIMEOUT" status. This value
              overrides the timeout value set in the parent job.

              Jobs must have timeout values less than 7 days or 10080
              minutes. Otherwise, the jobs will throw an exception.

              When the value is left blank, the timeout is defaulted
              to 2880 minutes.

              Any existing Glue jobs that had a timeout value greater
              than 7 days will be defaulted to 7 days. For instance if
              you have specified a timeout of 20 days for a batch job,
              it will be stopped on the 7th day.

              For streaming jobs, if you have set up a maintenance
              window, it will be restarted during the maintenance
              window after 7 days.

            * **MaxCapacity** *(float) --*

              For Glue version 1.0 or earlier jobs, using the standard
              worker type, the number of Glue data processing units
              (DPUs) that can be allocated when this job runs. A DPU
              is a relative measure of processing power that consists
              of 4 vCPUs of compute capacity and 16 GB of memory. For
              more information, see the Glue pricing page.

              For Glue version 2.0+ jobs, you cannot specify a
              "Maximum capacity". Instead, you should specify a
              "Worker type" and the "Number of workers".

              Do not set "MaxCapacity" if using "WorkerType" and
              "NumberOfWorkers".

              The value that can be allocated for "MaxCapacity"
              depends on whether you are running a Python shell job,
              an Apache Spark ETL job, or an Apache Spark streaming
              ETL job:

              * When you specify a Python shell job
                ("JobCommand.Name"="pythonshell"), you can allocate
                either 0.0625 or 1 DPU. The default is 0.0625 DPU.

              * When you specify an Apache Spark ETL job
                ("JobCommand.Name"="glueetl") or Apache Spark streaming
                ETL job ("JobCommand.Name"="gluestreaming"), you can
                allocate from 2 to 100 DPUs. The default is 10 DPUs.
                This job type cannot have a fractional DPU allocation.

            * **WorkerType** *(string) --*

              The type of predefined worker that is allocated when a
              job runs. Accepts a value of G.1X, G.2X, G.4X, G.8X or
              G.025X for Spark jobs. Accepts the value Z.2X for Ray
              jobs.

              * For the "G.1X" worker type, each worker maps to 1 DPU
                (4 vCPUs, 16 GB of memory) with 94GB disk, and
                provides 1 executor per worker. We recommend this
                worker type for workloads such as data transforms,
                joins, and queries, because it offers a scalable and
                cost-effective way to run most jobs.

              * For the "G.2X" worker type, each worker maps to 2 DPU
                (8 vCPUs, 32 GB of memory) with 138GB disk, and
                provides 1 executor per worker. We recommend this
                worker type for workloads such as data transforms,
                joins, and queries, because it offers a scalable and
                cost-effective way to run most jobs.

              * For the "G.4X" worker type, each worker maps to 4 DPU
                (16 vCPUs, 64 GB of memory) with 256GB disk, and
                provides 1 executor per worker. We recommend this
                worker type for jobs whose workloads contain your most
                demanding transforms, aggregations, joins, and
                queries. This worker type is available only for Glue
                version 3.0 or later Spark ETL jobs in the following
                Amazon Web Services Regions: US East (Ohio), US East
                (N. Virginia), US West (Oregon), Asia Pacific
                (Singapore), Asia Pacific (Sydney), Asia Pacific
                (Tokyo), Canada (Central), Europe (Frankfurt), Europe
                (Ireland), and Europe (Stockholm).

              * For the "G.8X" worker type, each worker maps to 8 DPU
                (32 vCPUs, 128 GB of memory) with 512GB disk, and
                provides 1 executor per worker. We recommend this
                worker type for jobs whose workloads contain your most
                demanding transforms, aggregations, joins, and
                queries. This worker type is available only for Glue
                version 3.0 or later Spark ETL jobs, in the same
                Amazon Web Services Regions as supported for the
                "G.4X" worker type.

              * For the "G.025X" worker type, each worker maps to 0.25
                DPU (2 vCPUs, 4 GB of memory) with 84GB disk, and
                provides 1 executor per worker. We recommend this
                worker type for low volume streaming jobs. This worker
                type is only available for Glue version 3.0 or later
                streaming jobs.

              * For the "Z.2X" worker type, each worker maps to 2
                M-DPU (8 vCPUs, 64 GB of memory) with 128 GB disk, and
                provides up to 8 Ray workers based on the autoscaler.

            * **NumberOfWorkers** *(integer) --*

              The number of workers of a defined "WorkerType" that are
              allocated when a job runs.

            * **SecurityConfiguration** *(string) --*

              The name of the "SecurityConfiguration" structure to be
              used with this job run.

            * **LogGroupName** *(string) --*

              The name of the log group for secure logging that can be
              server-side encrypted in Amazon CloudWatch using KMS.
              This name can be "/aws-glue/jobs/", in which case the
              default encryption is "NONE". If you add a role name and
              "SecurityConfiguration" name (in other words,
              "/aws-glue/jobs-yourRoleName-yourSecurityConfigurationName/"),
              then that security configuration is used to encrypt the
              log group.

            * **NotificationProperty** *(dict) --*

              Specifies configuration properties of a job run
              notification.

              * **NotifyDelayAfter** *(integer) --*

                After a job run starts, the number of minutes to wait
                before sending a job run delay notification.

            * **GlueVersion** *(string) --*

              In Spark jobs, "GlueVersion" determines the versions of
              Apache Spark and Python that Glue makes available in a
              job. The Python version indicates the version supported
              for jobs of type Spark.

              Ray jobs should set "GlueVersion" to "4.0" or greater.
              However, the versions of Ray, Python and additional
              libraries available in your Ray job are determined by
              the "Runtime" parameter of the Job command.

              For more information about the available Glue versions
              and corresponding Spark and Python versions, see Glue
              version in the developer guide.

              Jobs that are created without specifying a Glue version
              default to Glue 0.9.

            * **DPUSeconds** *(float) --*

              This field can be set for either job runs with execution
              class "FLEX" or when Auto Scaling is enabled, and
              represents the total time each executor ran during the
              lifecycle of a job run in seconds, multiplied by a DPU
              factor (1 for "G.1X", 2 for "G.2X", or 0.25 for "G.025X"
              workers). This value may be different than the
              "executionEngineRuntime" * "MaxCapacity" as in the case
              of Auto Scaling jobs, as the number of executors running
              at a given time may be less than the "MaxCapacity".
              Therefore, it is possible that the value of "DPUSeconds"
              is less than "executionEngineRuntime" * "MaxCapacity".

            * **ExecutionClass** *(string) --*

              Indicates whether the job is run with a standard or
              flexible execution class. The standard execution-class
              is ideal for time-sensitive workloads that require fast
              job startup and dedicated resources.

              The flexible execution class is appropriate for time-
              insensitive jobs whose start and completion times may
              vary.

              Only jobs with Glue version 3.0 and above and command
              type "glueetl" will be allowed to set "ExecutionClass"
              to "FLEX". The flexible execution class is available for
              Spark jobs.

            * **MaintenanceWindow** *(string) --*

              This field specifies a day of the week and hour for a
              maintenance window for streaming jobs. Glue periodically
              performs maintenance activities. During these
              maintenance windows, Glue will need to restart your
              streaming jobs.

              Glue will restart the job within 3 hours of the
              specified maintenance window. For instance, if you set
              up the maintenance window for Monday at 10:00AM GMT,
              your jobs will be restarted between 10:00AM GMT and
              1:00PM GMT.

            * **ProfileName** *(string) --*

              The name of a Glue usage profile associated with the
              job run.

            * **StateDetail** *(string) --*

              This field holds details that pertain to the state of a
              job run. The field is nullable.

              For example, when a job run is in a WAITING state as a
              result of job run queuing, the field has the reason why
              the job run is in that state.

            * **ExecutionRoleSessionPolicy** *(string) --*

              An inline session policy, passed to the StartJobRun API,
              that dynamically restricts the permissions of the
              specified execution role for the scope of the job,
              without requiring the creation of additional IAM roles.

        * **NextToken** *(string) --*

          A continuation token, if not all requested job runs have
          been returned.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
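
   Two details of the response are easy to miss: results may span
   several pages, and "ExecutionTime" is reported in seconds while
   "Timeout" is reported in minutes. The following is a minimal sketch
   that handles both. The helper names "iter_job_runs" and
   "timeout_fraction_used" and the default page size of 100 are
   assumptions made for illustration, and "client" is assumed to be a
   Glue client created with "boto3.client('glue')".

```python
def iter_job_runs(client, job_name, page_size=100):
    """Yield every run of a job, following NextToken across pages.

    `page_size` is an arbitrary illustrative default.
    """
    kwargs = {"JobName": job_name, "MaxResults": page_size}
    while True:
        response = client.get_job_runs(**kwargs)
        yield from response.get("JobRuns", [])
        token = response.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token


def timeout_fraction_used(job_run):
    """Return the share of the timeout a job run consumed.

    ExecutionTime is in seconds and Timeout is in minutes, so the
    timeout is converted to seconds before dividing. Returns None if
    either field is missing or the timeout is zero.
    """
    execution_seconds = job_run.get("ExecutionTime")
    timeout_minutes = job_run.get("Timeout")
    if execution_seconds is None or not timeout_minutes:
        return None
    return execution_seconds / (timeout_minutes * 60)
```

   For instance, a run with an "ExecutionTime" of 600 and a "Timeout"
   of 20 has used half of its allowed time.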


delete_registry
***************

Glue.Client.delete_registry(**kwargs)

   Delete the entire registry including schema and all of its
   versions. To get the status of the delete operation, you can call
   the "GetRegistry" API after the asynchronous call. Deleting a
   registry will deactivate all online operations for the registry
   such as the "UpdateRegistry", "CreateSchema", "UpdateSchema", and
   "RegisterSchemaVersion" APIs.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_registry(
          RegistryId={
              'RegistryName': 'string',
              'RegistryArn': 'string'
          }
      )

   Parameters:
      **RegistryId** (*dict*) --

      **[REQUIRED]**

      This is a wrapper structure that may contain the registry name
      and Amazon Resource Name (ARN).

      * **RegistryName** *(string) --*

        Name of the registry. Used only for lookup. One of
        "RegistryArn" or "RegistryName" has to be provided.

      * **RegistryArn** *(string) --*

        ARN of the registry to be deleted. One of "RegistryArn" or
        "RegistryName" has to be provided.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'RegistryName': 'string',
             'RegistryArn': 'string',
             'Status': 'AVAILABLE'|'DELETING'
         }

      **Response Structure**

      * *(dict) --*

        * **RegistryName** *(string) --*

          The name of the registry being deleted.

        * **RegistryArn** *(string) --*

          The Amazon Resource Name (ARN) of the registry being
          deleted.

        * **Status** *(string) --*

          The status of the registry. A successful operation will
          return the "DELETING" status.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.ConcurrentModificationException"
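
   Since the delete is asynchronous, a caller may want to poll
   "GetRegistry" afterwards, as the description above suggests. The
   following is a minimal sketch of that pattern. The helper name
   "delete_registry_and_wait" and its polling defaults are
   hypothetical, "client" is assumed to be a Glue client created with
   "boto3.client('glue')", and the sketch assumes that "get_registry"
   raises "EntityNotFoundException" once the registry is gone.

```python
import time


def delete_registry_and_wait(client, registry_name, poll_seconds=2.0,
                             max_polls=30):
    """Start deleting a registry, then poll GetRegistry until it is gone.

    Returns True once get_registry raises EntityNotFoundException
    (assumed here to signal that deletion has finished), or False if
    the registry is still visible after `max_polls` checks.
    """
    registry_id = {"RegistryName": registry_name}
    client.delete_registry(RegistryId=registry_id)
    for _ in range(max_polls):
        try:
            client.get_registry(RegistryId=registry_id)
        except client.exceptions.EntityNotFoundException:
            return True
        # Registry still exists (for example in DELETING status).
        time.sleep(poll_seconds)
    return False
```

   The bounded loop avoids waiting forever if the deletion stalls;
   the interval and the number of polls should be tuned to the
   workload.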


get_resource_policies
*********************

Glue.Client.get_resource_policies(**kwargs)

   Retrieves the resource policies set on individual resources by
   Resource Access Manager during cross-account permission grants.
   Also retrieves the Data Catalog resource policy.

   If you enabled metadata encryption in Data Catalog settings, and
   you do not have permission on the KMS key, the operation can't
   return the Data Catalog resource policy.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_resource_policies(
          NextToken='string',
          MaxResults=123
      )

   Parameters:
      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation request.

      * **MaxResults** (*integer*) -- The maximum size of a list to
        return.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'GetResourcePoliciesResponseList': [
                 {
                     'PolicyInJson': 'string',
                     'PolicyHash': 'string',
                     'CreateTime': datetime(2015, 1, 1),
                     'UpdateTime': datetime(2015, 1, 1)
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **GetResourcePoliciesResponseList** *(list) --*

          A list of the individual resource policies and the account-
          level resource policy.

          * *(dict) --*

            A structure for returning a resource policy.

            * **PolicyInJson** *(string) --*

              Contains the requested policy document, in JSON format.

            * **PolicyHash** *(string) --*

              Contains the hash value associated with this policy.

            * **CreateTime** *(datetime) --*

              The date and time at which the policy was created.

            * **UpdateTime** *(datetime) --*

              The date and time at which the policy was last updated.

        * **NextToken** *(string) --*

          A continuation token, if the returned list does not contain
          the last resource policy available.

   **Exceptions**

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.GlueEncryptionException"
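
   Each entry returns its policy document as a JSON string in
   "PolicyInJson", so it usually needs to be parsed before use. The
   following is a minimal sketch that collects all pages and parses
   each document. The helper name "list_resource_policies", the keys
   of the dictionaries it returns, and the default page size of 100
   are assumptions made for illustration, and "client" is assumed to
   be a Glue client created with "boto3.client('glue')".

```python
import json


def list_resource_policies(client, page_size=100):
    """Return every resource policy with its document parsed from JSON.

    `page_size` is an arbitrary illustrative default.
    """
    policies = []
    kwargs = {"MaxResults": page_size}
    while True:
        response = client.get_resource_policies(**kwargs)
        for entry in response.get("GetResourcePoliciesResponseList", []):
            policies.append({
                "hash": entry.get("PolicyHash"),
                "updated": entry.get("UpdateTime"),
                # PolicyInJson is a string; parse it into a dict.
                "document": json.loads(entry["PolicyInJson"]),
            })
        token = response.get("NextToken")
        if not token:
            return policies
        kwargs["NextToken"] = token
```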


get_table_optimizer
*******************

Glue.Client.get_table_optimizer(**kwargs)

   Returns the configuration of all optimizers associated with a
   specified table.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_table_optimizer(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          Type='compaction'|'retention'|'orphan_file_deletion'
      )

   Parameters:
      * **CatalogId** (*string*) --

        **[REQUIRED]**

        The Catalog ID of the table.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the database in the catalog in which the table
        resides.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table.

      * **Type** (*string*) --

        **[REQUIRED]**

        The type of table optimizer.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'CatalogId': 'string',
             'DatabaseName': 'string',
             'TableName': 'string',
             'TableOptimizer': {
                 'type': 'compaction'|'retention'|'orphan_file_deletion',
                 'configuration': {
                     'roleArn': 'string',
                     'enabled': True|False,
                     'vpcConfiguration': {
                         'glueConnectionName': 'string'
                     },
                     'compactionConfiguration': {
                         'icebergConfiguration': {
                             'strategy': 'binpack'|'sort'|'z-order',
                             'minInputFiles': 123,
                             'deleteFileThreshold': 123
                         }
                     },
                     'retentionConfiguration': {
                         'icebergConfiguration': {
                             'snapshotRetentionPeriodInDays': 123,
                             'numberOfSnapshotsToRetain': 123,
                             'cleanExpiredFiles': True|False,
                             'runRateInHours': 123
                         }
                     },
                     'orphanFileDeletionConfiguration': {
                         'icebergConfiguration': {
                             'orphanFileRetentionPeriodInDays': 123,
                             'location': 'string',
                             'runRateInHours': 123
                         }
                     }
                 },
                 'lastRun': {
                     'eventType': 'starting'|'completed'|'failed'|'in_progress',
                     'startTimestamp': datetime(2015, 1, 1),
                     'endTimestamp': datetime(2015, 1, 1),
                     'metrics': {
                         'NumberOfBytesCompacted': 'string',
                         'NumberOfFilesCompacted': 'string',
                         'NumberOfDpus': 'string',
                         'JobDurationInHour': 'string'
                     },
                     'error': 'string',
                     'compactionMetrics': {
                         'IcebergMetrics': {
                             'NumberOfBytesCompacted': 123,
                             'NumberOfFilesCompacted': 123,
                             'DpuHours': 123.0,
                             'NumberOfDpus': 123,
                             'JobDurationInHour': 123.0
                         }
                     },
                     'compactionStrategy': 'binpack'|'sort'|'z-order',
                     'retentionMetrics': {
                         'IcebergMetrics': {
                             'NumberOfDataFilesDeleted': 123,
                             'NumberOfManifestFilesDeleted': 123,
                             'NumberOfManifestListsDeleted': 123,
                             'DpuHours': 123.0,
                             'NumberOfDpus': 123,
                             'JobDurationInHour': 123.0
                         }
                     },
                     'orphanFileDeletionMetrics': {
                         'IcebergMetrics': {
                             'NumberOfOrphanFilesDeleted': 123,
                             'DpuHours': 123.0,
                             'NumberOfDpus': 123,
                             'JobDurationInHour': 123.0
                         }
                     }
                 },
                 'configurationSource': 'catalog'|'table'
             }
         }

      **Response Structure**

      * *(dict) --*

        * **CatalogId** *(string) --*

          The Catalog ID of the table.

        * **DatabaseName** *(string) --*

          The name of the database in the catalog in which the table
          resides.

        * **TableName** *(string) --*

          The name of the table.

        * **TableOptimizer** *(dict) --*

          The optimizer associated with the specified table.

          * **type** *(string) --*

            The type of table optimizer. The valid values are:

            * "compaction": for managing compaction with a table
              optimizer.

            * "retention": for managing the retention of snapshots
              with a table optimizer.

            * "orphan_file_deletion": for managing the deletion of
              orphan files with a table optimizer.

          * **configuration** *(dict) --*

            A "TableOptimizerConfiguration" object that was specified
            when creating or updating a table optimizer.

            * **roleArn** *(string) --*

              A role passed by the caller which gives the service
              permission to update the resources associated with the
              optimizer on the caller's behalf.

            * **enabled** *(boolean) --*

              Whether table optimization is enabled.

            * **vpcConfiguration** *(dict) --*

              A "TableOptimizerVpcConfiguration" object representing
              the VPC configuration for a table optimizer.

              This configuration is necessary to perform optimization
              on tables that are in a customer VPC.

              Note:

                This is a Tagged Union structure. Only one of the
                following top level keys will be set:
                "glueConnectionName". If a client receives an unknown
                member it will set "SDK_UNKNOWN_MEMBER" as the top
                level key, which maps to the name or tag of the
                unknown member. The structure of "SDK_UNKNOWN_MEMBER"
                is as follows:

                   'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}

              * **glueConnectionName** *(string) --*

                The name of the Glue connection used for the VPC for
                the table optimizer.

            * **compactionConfiguration** *(dict) --*

              The configuration for a compaction optimizer. This
              configuration defines how data files in your table will
              be compacted to improve query performance and reduce
              storage costs.

              * **icebergConfiguration** *(dict) --*

                The configuration for an Iceberg compaction optimizer.

                * **strategy** *(string) --*

                  The strategy to use for compaction. Valid values
                  are:

                  * "binpack": Combines small files into larger files,
                    typically targeting sizes over 100MB, while
                    applying any pending deletes. This is the
                    recommended compaction strategy for most use
                    cases.

                  * "sort": Organizes data based on specified columns
                    which are sorted hierarchically during compaction,
                    improving query performance for filtered
                    operations. This strategy is recommended when your
                    queries frequently filter on specific columns. To
                    use this strategy, you must first define a sort
                    order in your Iceberg table properties using the
                    "sort_order" table property.

                  * "z-order": Optimizes data organization by blending
                    multiple attributes into a single scalar value
                    that can be used for sorting, allowing efficient
                    querying across multiple dimensions. This strategy
                    is recommended when you need to query data across
                    multiple dimensions simultaneously. To use this
                    strategy, you must first define a sort order in
                    your Iceberg table properties using the
                    "sort_order" table property.

                  If an input is not provided, the default value
                  "binpack" will be used.

                * **minInputFiles** *(integer) --*

                  The minimum number of data files that must be
                  present in a partition before compaction will
                  actually compact files. This parameter helps control
                  when compaction is triggered, preventing unnecessary
                  compaction operations on partitions with few files.
                  If an input is not provided, the default value 100
                  will be used.

                * **deleteFileThreshold** *(integer) --*

                  The minimum number of deletes that must be present
                  in a data file to make it eligible for compaction.
                  This parameter helps optimize compaction by focusing
                  on files that contain a significant number of delete
                  operations, which can improve query performance by
                  removing deleted records. If an input is not
                  provided, the default value 1 will be used.

            * **retentionConfiguration** *(dict) --*

              The configuration for a snapshot retention optimizer.

              * **icebergConfiguration** *(dict) --*

                The configuration for an Iceberg snapshot retention
                optimizer.

                * **snapshotRetentionPeriodInDays** *(integer) --*

                  The number of days to retain the Iceberg snapshots.
                  If an input is not provided, the corresponding
                  Iceberg table configuration field will be used or if
                  not present, the default value 5 will be used.

                * **numberOfSnapshotsToRetain** *(integer) --*

                  The number of Iceberg snapshots to retain within the
                  retention period. If an input is not provided, the
                  corresponding Iceberg table configuration field will
                  be used or if not present, the default value 1 will
                  be used.

                * **cleanExpiredFiles** *(boolean) --*

                  If set to false, snapshots are only deleted from
                  table metadata, and the underlying data and metadata
                  files are not deleted.

                * **runRateInHours** *(integer) --*

                  The interval in hours between retention job runs.
                  This parameter controls how frequently the retention
                  optimizer will run to clean up expired snapshots.
                  The value must be between 3 and 168 hours (7 days).
                  If an input is not provided, the default value 24
                  will be used.

            * **orphanFileDeletionConfiguration** *(dict) --*

              The configuration for an orphan file deletion optimizer.

              * **icebergConfiguration** *(dict) --*

                The configuration for an Iceberg orphan file deletion
                optimizer.

                * **orphanFileRetentionPeriodInDays** *(integer) --*

                  The number of days that orphan files should be
                  retained before file deletion. If an input is not
                  provided, the default value 3 will be used.

                * **location** *(string) --*

                  Specifies a directory in which to look for files
                  (defaults to the table's location). You may choose a
                  sub-directory rather than the top-level table
                  location.

                * **runRateInHours** *(integer) --*

                  The interval in hours between orphan file deletion
                  job runs. This parameter controls how frequently the
                  orphan file deletion optimizer will run to clean up
                  orphan files. The value must be between 3 and 168
                  hours (7 days). If an input is not provided, the
                  default value 24 will be used.

          * **lastRun** *(dict) --*

            A "TableOptimizerRun" object representing the last run of
            the table optimizer.

            * **eventType** *(string) --*

              An event type representing the status of the table
              optimizer run.

            * **startTimestamp** *(datetime) --*

              Represents the epoch timestamp at which the compaction
              job was started within Lake Formation.

            * **endTimestamp** *(datetime) --*

              Represents the epoch timestamp at which the compaction
              job ended.

            * **metrics** *(dict) --*

              A "RunMetrics" object containing metrics for the
              optimizer run.

              This member is deprecated. See the individual metric
              members for compaction, retention, and orphan file
              deletion.

              * **NumberOfBytesCompacted** *(string) --*

                The number of bytes removed by the compaction job run.

              * **NumberOfFilesCompacted** *(string) --*

                The number of files removed by the compaction job run.

              * **NumberOfDpus** *(string) --*

                The number of DPUs consumed by the job, rounded up to
                the nearest whole number.

              * **JobDurationInHour** *(string) --*

                The duration of the job in hours.

            * **error** *(string) --*

              An error that occurred during the optimizer run.

            * **compactionMetrics** *(dict) --*

              A "CompactionMetrics" object containing metrics for the
              optimizer run.

              * **IcebergMetrics** *(dict) --*

                A structure containing the Iceberg compaction metrics
                for the optimizer run.

                * **NumberOfBytesCompacted** *(integer) --*

                  The number of bytes removed by the compaction job
                  run.

                * **NumberOfFilesCompacted** *(integer) --*

                  The number of files removed by the compaction job
                  run.

                * **DpuHours** *(float) --*

                  The number of DPU hours consumed by the job.

                * **NumberOfDpus** *(integer) --*

                  The number of DPUs consumed by the job, rounded up
                  to the nearest whole number.

                * **JobDurationInHour** *(float) --*

                  The duration of the job in hours.

            * **compactionStrategy** *(string) --*

              The strategy used for the compaction run. Indicates
              which algorithm was applied to determine how files were
              selected and combined during the compaction process.
              Valid values are:

              * "binpack": Combines small files into larger files,
                typically targeting sizes over 100MB, while applying
                any pending deletes. This is the recommended
                compaction strategy for most use cases.

              * "sort": Organizes data based on specified columns
                which are sorted hierarchically during compaction,
                improving query performance for filtered operations.
                This strategy is recommended when your queries
                frequently filter on specific columns. To use this
                strategy, you must first define a sort order in your
                Iceberg table properties using the "sort_order" table
                property.

              * "z-order": Optimizes data organization by blending
                multiple attributes into a single scalar value that
                can be used for sorting, allowing efficient querying
                across multiple dimensions. This strategy is
                recommended when you need to query data across
                multiple dimensions simultaneously. To use this
                strategy, you must first define a sort order in your
                Iceberg table properties using the "sort_order" table
                property.

            * **retentionMetrics** *(dict) --*

              A "RetentionMetrics" object containing metrics for the
              optimizer run.

              * **IcebergMetrics** *(dict) --*

                A structure containing the Iceberg retention metrics
                for the optimizer run.

                * **NumberOfDataFilesDeleted** *(integer) --*

                  The number of data files deleted by the retention
                  job run.

                * **NumberOfManifestFilesDeleted** *(integer) --*

                  The number of manifest files deleted by the
                  retention job run.

                * **NumberOfManifestListsDeleted** *(integer) --*

                  The number of manifest lists deleted by the
                  retention job run.

                * **DpuHours** *(float) --*

                  The number of DPU hours consumed by the job.

                * **NumberOfDpus** *(integer) --*

                  The number of DPUs consumed by the job, rounded up
                  to the nearest whole number.

                * **JobDurationInHour** *(float) --*

                  The duration of the job in hours.

            * **orphanFileDeletionMetrics** *(dict) --*

              An "OrphanFileDeletionMetrics" object containing metrics
              for the optimizer run.

              * **IcebergMetrics** *(dict) --*

                A structure containing the Iceberg orphan file
                deletion metrics for the optimizer run.

                * **NumberOfOrphanFilesDeleted** *(integer) --*

                  The number of orphan files deleted by the orphan
                  file deletion job run.

                * **DpuHours** *(float) --*

                  The number of DPU hours consumed by the job.

                * **NumberOfDpus** *(integer) --*

                  The number of DPUs consumed by the job, rounded up
                  to the nearest whole number.

                * **JobDurationInHour** *(float) --*

                  The duration of the job in hours.

          * **configurationSource** *(string) --*

            Specifies the source of the optimizer configuration. This
            indicates how the table optimizer was configured and which
            entity or service initiated the configuration.
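
   The compaction defaults described above can be applied on the
   caller's side. This is a minimal sketch, not part of the Glue API:
   the helper name "effective_compaction_config" is hypothetical, and
   it assumes that members left unset are simply absent from the
   response. It works on the response dict only and makes no call to
   AWS.

```python
# Defaults taken from the member descriptions of icebergConfiguration above.
COMPACTION_DEFAULTS = {
    "strategy": "binpack",
    "minInputFiles": 100,
    "deleteFileThreshold": 1,
}


def effective_compaction_config(response):
    """Merge documented defaults with the values present in the response.

    Hypothetical helper. Assumes unset members are missing from the
    response rather than returned with their default values.
    """
    config = response.get("TableOptimizer", {}).get("configuration", {})
    iceberg = config.get("compactionConfiguration", {}).get(
        "icebergConfiguration", {}
    )
    # Values from the response override the documented defaults.
    return {**COMPACTION_DEFAULTS, **iceberg}
```

   The retention defaults are deliberately left out of this sketch,
   because they fall back to the Iceberg table configuration before
   the documented default applies.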

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.ThrottlingException"
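
   Because "metrics" is deprecated in favour of the per-optimizer
   members, a caller may want to read the typed metrics first and fall
   back to the deprecated member only when nothing else is present.
   The sketch below shows one way to do that; the helper name
   "summarize_last_run" is hypothetical, and it takes the response
   dict as input rather than calling AWS itself.

```python
# Per-optimizer metric members, in the order they appear in the response.
TYPED_METRIC_KEYS = (
    "compactionMetrics",
    "retentionMetrics",
    "orphanFileDeletionMetrics",
)


def summarize_last_run(response):
    """Return (member_name, metrics_dict) for the last run, or None.

    Hypothetical helper: prefers the typed IcebergMetrics members and
    falls back to the deprecated, string-valued "metrics" member.
    """
    last_run = response.get("TableOptimizer", {}).get("lastRun")
    if not last_run:
        return None
    for key in TYPED_METRIC_KEYS:
        iceberg = last_run.get(key, {}).get("IcebergMetrics")
        if iceberg is not None:
            return key, dict(iceberg)
    legacy = last_run.get("metrics")
    if legacy:
        return "metrics", dict(legacy)
    return None
```

   Passing the dict returned by "client.get_table_optimizer" to this
   helper yields the numeric metrics when they exist. Values from the
   deprecated member stay strings and are not converted.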


get_job_run
***********

Glue.Client.get_job_run(**kwargs)

   Retrieves the metadata for a given job run. Job run history is
   accessible for 365 days for your workflow and job run.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_job_run(
          JobName='string',
          RunId='string',
          PredecessorsIncluded=True|False
      )

   Parameters:
      * **JobName** (*string*) --

        **[REQUIRED]**

        Name of the job definition being run.

      * **RunId** (*string*) --

        **[REQUIRED]**

        The ID of the job run.

      * **PredecessorsIncluded** (*boolean*) -- True if a list of
        predecessor runs should be returned.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'JobRun': {
                 'Id': 'string',
                 'Attempt': 123,
                 'PreviousRunId': 'string',
                 'TriggerName': 'string',
                 'JobName': 'string',
                 'JobMode': 'SCRIPT'|'VISUAL'|'NOTEBOOK',
                 'JobRunQueuingEnabled': True|False,
                 'StartedOn': datetime(2015, 1, 1),
                 'LastModifiedOn': datetime(2015, 1, 1),
                 'CompletedOn': datetime(2015, 1, 1),
                 'JobRunState': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT'|'ERROR'|'WAITING'|'EXPIRED',
                 'Arguments': {
                     'string': 'string'
                 },
                 'ErrorMessage': 'string',
                 'PredecessorRuns': [
                     {
                         'JobName': 'string',
                         'RunId': 'string'
                     },
                 ],
                 'AllocatedCapacity': 123,
                 'ExecutionTime': 123,
                 'Timeout': 123,
                 'MaxCapacity': 123.0,
                 'WorkerType': 'Standard'|'G.1X'|'G.2X'|'G.025X'|'G.4X'|'G.8X'|'Z.2X',
                 'NumberOfWorkers': 123,
                 'SecurityConfiguration': 'string',
                 'LogGroupName': 'string',
                 'NotificationProperty': {
                     'NotifyDelayAfter': 123
                 },
                 'GlueVersion': 'string',
                 'DPUSeconds': 123.0,
                 'ExecutionClass': 'FLEX'|'STANDARD',
                 'MaintenanceWindow': 'string',
                 'ProfileName': 'string',
                 'StateDetail': 'string',
                 'ExecutionRoleSessionPolicy': 'string'
             }
         }

      **Response Structure**

      * *(dict) --*

        * **JobRun** *(dict) --*

          The requested job-run metadata.

          * **Id** *(string) --*

            The ID of this job run.

          * **Attempt** *(integer) --*

            The number of the attempt to run this job.

          * **PreviousRunId** *(string) --*

            The ID of the previous run of this job. For example, the
            "JobRunId" specified in the "StartJobRun" action.

          * **TriggerName** *(string) --*

            The name of the trigger that started this job run.

          * **JobName** *(string) --*

            The name of the job definition being used in this run.

          * **JobMode** *(string) --*

            A mode that describes how a job was created. Valid values
            are:

            * "SCRIPT" - The job was created using the Glue Studio
              script editor.

            * "VISUAL" - The job was created using the Glue Studio
              visual editor.

            * "NOTEBOOK" - The job was created using an interactive
              sessions notebook.

            When the "JobMode" field is missing or null, "SCRIPT" is
            assigned as the default value.

          * **JobRunQueuingEnabled** *(boolean) --*

            Specifies whether job run queuing is enabled for the job
            run.

            A value of true means job run queuing is enabled for the
            job run. If false or not populated, the job run will not
            be considered for queuing.

          * **StartedOn** *(datetime) --*

            The date and time at which this job run was started.

          * **LastModifiedOn** *(datetime) --*

            The last time that this job run was modified.

          * **CompletedOn** *(datetime) --*

            The date and time that this job run completed.

          * **JobRunState** *(string) --*

            The current state of the job run. For more information
            about the statuses of jobs that have terminated
            abnormally, see Glue Job Run Statuses.

          * **Arguments** *(dict) --*

            The job arguments associated with this run. For this job
            run, they replace the default arguments set in the job
            definition itself.

            You can specify arguments here that your own job-execution
            script consumes, as well as arguments that Glue itself
            consumes.

            Job arguments may be logged. Do not pass plaintext secrets
            as arguments. Retrieve secrets from a Glue Connection,
            Secrets Manager or other secret management mechanism if
            you intend to keep them within the Job.

            For information about how to specify and consume your own
            Job arguments, see the Calling Glue APIs in Python topic
            in the developer guide.

            For information about the arguments you can provide to
            this field when configuring Spark jobs, see the Special
            Parameters Used by Glue topic in the developer guide.

            For information about the arguments you can provide to
            this field when configuring Ray jobs, see Using job
            parameters in Ray jobs in the developer guide.

            * *(string) --*

              * *(string) --*

          * **ErrorMessage** *(string) --*

            An error message associated with this job run.

          * **PredecessorRuns** *(list) --*

            A list of predecessors to this job run.

            * *(dict) --*

              A job run that was used in the predicate of a
              conditional trigger that triggered this job run.

              * **JobName** *(string) --*

                The name of the job definition used by the predecessor
                job run.

              * **RunId** *(string) --*

                The job-run ID of the predecessor job run.

          * **AllocatedCapacity** *(integer) --*

            This field is deprecated. Use "MaxCapacity" instead.

            The number of Glue data processing units (DPUs) allocated
            to this JobRun. From 2 to 100 DPUs can be allocated; the
            default is 10. A DPU is a relative measure of processing
            power that consists of 4 vCPUs of compute capacity and 16
            GB of memory. For more information, see the Glue pricing
            page.

          * **ExecutionTime** *(integer) --*

            The amount of time (in seconds) that the job run consumed
            resources.

          * **Timeout** *(integer) --*

            The "JobRun" timeout in minutes. This is the maximum time
            that a job run can consume resources before it is
            terminated and enters "TIMEOUT" status. This value
            overrides the timeout value set in the parent job.

            Jobs must have timeout values less than 7 days or 10080
            minutes. Otherwise, the jobs will throw an exception.

            When the value is left blank, the timeout is defaulted to
            2880 minutes.

            Any existing Glue jobs that had a timeout value greater
            than 7 days will be defaulted to 7 days. For instance, if
            you have specified a timeout of 20 days for a batch job,
            it will be stopped on the 7th day.

            For streaming jobs, if you have set up a maintenance
            window, it will be restarted during the maintenance window
            after 7 days.

          * **MaxCapacity** *(float) --*

            For Glue version 1.0 or earlier jobs, using the standard
            worker type, the number of Glue data processing units
            (DPUs) that can be allocated when this job runs. A DPU is
            a relative measure of processing power that consists of 4
            vCPUs of compute capacity and 16 GB of memory. For more
            information, see the Glue pricing page.

            For Glue version 2.0+ jobs, you cannot specify a "Maximum
            capacity". Instead, you should specify a "Worker type" and
            the "Number of workers".

            Do not set "MaxCapacity" if using "WorkerType" and
            "NumberOfWorkers".

            The value that can be allocated for "MaxCapacity" depends
            on whether you are running a Python shell job, an Apache
            Spark ETL job, or an Apache Spark streaming ETL job:

            * When you specify a Python shell job
              ("JobCommand.Name"="pythonshell"), you can allocate
              either 0.0625 or 1 DPU. The default is 0.0625 DPU.

            * When you specify an Apache Spark ETL job
              ("JobCommand.Name"="glueetl") or Apache Spark streaming
              ETL job ("JobCommand.Name"="gluestreaming"), you can
              allocate from 2 to 100 DPUs. The default is 10 DPUs.
              This job type cannot have a fractional DPU allocation.

          * **WorkerType** *(string) --*

            The type of predefined worker that is allocated when a job
            runs. Accepts a value of G.1X, G.2X, G.4X, G.8X or G.025X
            for Spark jobs. Accepts the value Z.2X for Ray jobs.

            * For the "G.1X" worker type, each worker maps to 1 DPU (4
              vCPUs, 16 GB of memory) with 94GB disk, and provides 1
              executor per worker. We recommend this worker type for
              workloads such as data transforms, joins, and queries,
              to offer a scalable and cost-effective way to run most
              jobs.

            * For the "G.2X" worker type, each worker maps to 2 DPU (8
              vCPUs, 32 GB of memory) with 138GB disk, and provides 1
              executor per worker. We recommend this worker type for
              workloads such as data transforms, joins, and queries,
              to offer a scalable and cost-effective way to run most
              jobs.

            * For the "G.4X" worker type, each worker maps to 4 DPU
              (16 vCPUs, 64 GB of memory) with 256GB disk, and
              provides 1 executor per worker. We recommend this worker
              type for jobs whose workloads contain your most
              demanding transforms, aggregations, joins, and queries.
              This worker type is available only for Glue version 3.0
              or later Spark ETL jobs in the following Amazon Web
              Services Regions: US East (Ohio), US East (N. Virginia),
              US West (Oregon), Asia Pacific (Singapore), Asia Pacific
              (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe
              (Frankfurt), Europe (Ireland), and Europe (Stockholm).

            * For the "G.8X" worker type, each worker maps to 8 DPU
              (32 vCPUs, 128 GB of memory) with 512GB disk, and
              provides 1 executor per worker. We recommend this worker
              type for jobs whose workloads contain your most
              demanding transforms, aggregations, joins, and queries.
              This worker type is available only for Glue version 3.0
              or later Spark ETL jobs, in the same Amazon Web Services
              Regions as supported for the "G.4X" worker type.

            * For the "G.025X" worker type, each worker maps to 0.25
              DPU (2 vCPUs, 4 GB of memory) with 84GB disk, and
              provides 1 executor per worker. We recommend this worker
              type for low volume streaming jobs. This worker type is
              only available for Glue version 3.0 or later streaming
              jobs.

            * For the "Z.2X" worker type, each worker maps to 2 M-DPU
              (8 vCPUs, 64 GB of memory) with 128 GB disk, and provides
              up to 8 Ray workers based on the autoscaler.

          * **NumberOfWorkers** *(integer) --*

            The number of workers of a defined "workerType" that are
            allocated when a job runs.

          * **SecurityConfiguration** *(string) --*

            The name of the "SecurityConfiguration" structure to be
            used with this job run.

          * **LogGroupName** *(string) --*

            The name of the log group for secure logging that can be
            server-side encrypted in Amazon CloudWatch using KMS. This
            name can be "/aws-glue/jobs/", in which case the default
            encryption is "NONE". If you add a role name and
            "SecurityConfiguration" name (in other words,
            "/aws-glue/jobs-yourRoleName-yourSecurityConfigurationName/"),
            then that security configuration is used to encrypt the
            log group.

          * **NotificationProperty** *(dict) --*

            Specifies configuration properties of a job run
            notification.

            * **NotifyDelayAfter** *(integer) --*

              After a job run starts, the number of minutes to wait
              before sending a job run delay notification.

          * **GlueVersion** *(string) --*

            In Spark jobs, "GlueVersion" determines the versions of
            Apache Spark and Python that Glue makes available in a
            job. The Python version indicates the version supported
            for jobs of type Spark.

            Ray jobs should set "GlueVersion" to "4.0" or greater.
            However, the versions of Ray, Python and additional
            libraries available in your Ray job are determined by the
            "Runtime" parameter of the Job command.

            For more information about the available Glue versions and
            corresponding Spark and Python versions, see Glue version
            in the developer guide.

            Jobs that are created without specifying a Glue version
            default to Glue 0.9.

          * **DPUSeconds** *(float) --*

            This field can be set for either job runs with execution
            class "FLEX" or when Auto Scaling is enabled, and
            represents the total time each executor ran during the
            lifecycle of a job run in seconds, multiplied by a DPU
            factor (1 for "G.1X", 2 for "G.2X", or 0.25 for "G.025X"
            workers). This value may be different than the
            "executionEngineRuntime" * "MaxCapacity" as in the case of
            Auto Scaling jobs, as the number of executors running at a
            given time may be less than the "MaxCapacity". Therefore,
            it is possible that the value of "DPUSeconds" is less than
            "executionEngineRuntime" * "MaxCapacity".

          * **ExecutionClass** *(string) --*

            Indicates whether the job is run with a standard or
            flexible execution class. The standard execution-class is
            ideal for time-sensitive workloads that require fast job
            startup and dedicated resources.

            The flexible execution class is appropriate for time-
            insensitive jobs whose start and completion times may
            vary.

            Only jobs with Glue version 3.0 and above and command type
            "glueetl" will be allowed to set "ExecutionClass" to
            "FLEX". The flexible execution class is available for
            Spark jobs.

          * **MaintenanceWindow** *(string) --*

            This field specifies a day of the week and hour for a
            maintenance window for streaming jobs. Glue periodically
            performs maintenance activities. During these maintenance
            windows, Glue will need to restart your streaming jobs.

            Glue will restart the job within 3 hours of the specified
            maintenance window. For instance, if you set up the
            maintenance window for Monday at 10:00AM GMT, your jobs
            will be restarted between 10:00AM GMT and 1:00PM GMT.

          * **ProfileName** *(string) --*

            The name of a Glue usage profile associated with the job
            run.

          * **StateDetail** *(string) --*

            This field holds details that pertain to the state of a
            job run. The field is nullable.

            For example, when a job run is in a WAITING state as a
            result of job run queuing, the field has the reason why
            the job run is in that state.

          * **ExecutionRoleSessionPolicy** *(string) --*

            An inline session policy, passed to the StartJobRun API,
            that lets you dynamically restrict the permissions of the
            specified execution role for the scope of the job, without
            requiring the creation of additional IAM roles.
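
   The worker-type descriptions and the "DPUSeconds" member above
   lend themselves to a small calculation. The sketch below is
   illustrative only: the helper names "configured_dpus" and
   "dpu_hours" are hypothetical, the per-worker factors are copied
   from the "WorkerType" description, and "Z.2X" is left out because
   it is measured in M-DPU rather than DPU.

```python
# DPU per worker, as stated in the WorkerType description above.
DPU_PER_WORKER = {
    "G.025X": 0.25,
    "G.1X": 1.0,
    "G.2X": 2.0,
    "G.4X": 4.0,
    "G.8X": 8.0,
}


def configured_dpus(job_run):
    """DPUs implied by WorkerType and NumberOfWorkers, else MaxCapacity.

    Hypothetical helper. Returns None when neither is usable, for
    example for "Z.2X" workers without a MaxCapacity value.
    """
    worker_type = job_run.get("WorkerType")
    workers = job_run.get("NumberOfWorkers")
    if worker_type in DPU_PER_WORKER and workers is not None:
        return DPU_PER_WORKER[worker_type] * workers
    return job_run.get("MaxCapacity")


def dpu_hours(job_run):
    """Convert DPUSeconds to DPU-hours, or None if the field is absent."""
    seconds = job_run.get("DPUSeconds")
    return None if seconds is None else seconds / 3600.0
```

   Both helpers take the "JobRun" dict from the response. With Auto
   Scaling the value derived from "DPUSeconds" may come out lower than
   the configured capacity multiplied by the run time, as the
   description of that member explains.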

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"


modify_integration
******************

Glue.Client.modify_integration(**kwargs)

   Modifies a Zero-ETL integration in the caller's account.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.modify_integration(
          IntegrationIdentifier='string',
          Description='string',
          DataFilter='string',
          IntegrationName='string'
      )

   Parameters:
      * **IntegrationIdentifier** (*string*) --

        **[REQUIRED]**

        The Amazon Resource Name (ARN) for the integration.

      * **Description** (*string*) -- A description of the
        integration.

      * **DataFilter** (*string*) -- Selects source tables for the
        integration using Maxwell filter syntax.

      * **IntegrationName** (*string*) -- A unique name for an
        integration in Glue.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'SourceArn': 'string',
             'TargetArn': 'string',
             'IntegrationName': 'string',
             'Description': 'string',
             'IntegrationArn': 'string',
             'KmsKeyId': 'string',
             'AdditionalEncryptionContext': {
                 'string': 'string'
             },
             'Tags': [
                 {
                     'key': 'string',
                     'value': 'string'
                 },
             ],
             'Status': 'CREATING'|'ACTIVE'|'MODIFYING'|'FAILED'|'DELETING'|'SYNCING'|'NEEDS_ATTENTION',
             'CreateTime': datetime(2015, 1, 1),
             'Errors': [
                 {
                     'ErrorCode': 'string',
                     'ErrorMessage': 'string'
                 },
             ],
             'DataFilter': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **SourceArn** *(string) --*

          The ARN of the source for the integration.

        * **TargetArn** *(string) --*

          The ARN of the target for the integration.

        * **IntegrationName** *(string) --*

          A unique name for an integration in Glue.

        * **Description** *(string) --*

          A description of the integration.

        * **IntegrationArn** *(string) --*

          The Amazon Resource Name (ARN) for the integration.

        * **KmsKeyId** *(string) --*

          The ARN of a KMS key used for encrypting the channel.

        * **AdditionalEncryptionContext** *(dict) --*

          An optional set of non-secret key–value pairs that contains
          additional contextual information for encryption.

          * *(string) --*

            * *(string) --*

        * **Tags** *(list) --*

          Metadata assigned to the resource consisting of a list of
          key-value pairs.

          * *(dict) --*

            The "Tag" object represents a label that you can assign to
            an Amazon Web Services resource. Each tag consists of a
            key and an optional value, both of which you define.

            For more information about tags, and controlling access to
            resources in Glue, see Amazon Web Services Tags in Glue
            and Specifying Glue Resource ARNs in the developer guide.

            * **key** *(string) --*

              The tag key. The key is required when you create a tag
              on an object. The key is case-sensitive, and must not
              contain the prefix aws.

            * **value** *(string) --*

              The tag value. The value is optional when you create a
              tag on an object. The value is case-sensitive, and must
              not contain the prefix aws.

        * **Status** *(string) --*

          The status of the integration being modified.

          The possible statuses are:

          * CREATING: The integration is being created.

          * ACTIVE: The integration creation succeeds.

          * MODIFYING: The integration is being modified.

          * FAILED: The integration creation fails.

          * DELETING: The integration is being deleted.

          * SYNCING: The integration is synchronizing.

          * NEEDS_ATTENTION: The integration needs attention, such as
            synchronization.

        * **CreateTime** *(datetime) --*

          The time when the integration was created, in UTC.

        * **Errors** *(list) --*

          A list of errors associated with the integration
          modification.

          * *(dict) --*

            An error associated with a zero-ETL integration.

            * **ErrorCode** *(string) --*

              The code associated with this error.

            * **ErrorMessage** *(string) --*

              A message describing the error.

        * **DataFilter** *(string) --*

          Selects source tables for the integration using Maxwell
          filter syntax.

   **Exceptions**

   * "Glue.Client.exceptions.ValidationException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.InternalServerException"

   * "Glue.Client.exceptions.IntegrationNotFoundFault"

   * "Glue.Client.exceptions.IntegrationConflictOperationFault"

   * "Glue.Client.exceptions.InvalidIntegrationStateFault"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.ConflictException"

   * "Glue.Client.exceptions.InvalidStateException"

   * "Glue.Client.exceptions.InvalidInputException"
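
   The request and response shapes above can be sketched as a pair of
   small helpers. This is a minimal illustration, not part of the
   boto3 API: the helper names "build_modify_integration_request" and
   "integration_errors" are hypothetical, and the helpers only
   assemble and read plain dictionaries.

```python
def build_modify_integration_request(integration_arn, description=None,
                                     data_filter=None, integration_name=None):
    """Build keyword arguments for Glue.Client.modify_integration.

    Only IntegrationIdentifier is required; optional fields are left out
    when they are None so the request carries only what was supplied.
    """
    if not integration_arn:
        raise ValueError("IntegrationIdentifier (the integration ARN) is required")
    request = {"IntegrationIdentifier": integration_arn}
    optional = {
        "Description": description,
        "DataFilter": data_filter,
        "IntegrationName": integration_name,
    }
    request.update({k: v for k, v in optional.items() if v is not None})
    return request


def integration_errors(response):
    """Return (ErrorCode, ErrorMessage) pairs from a modify_integration response."""
    return [(e.get("ErrorCode"), e.get("ErrorMessage"))
            for e in response.get("Errors", [])]
```

   With a real client, the result would be passed as
   "client.modify_integration(**request)", and "Status" in the
   response can then be checked alongside any reported errors.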


get_column_statistics_task_settings
***********************************

Glue.Client.get_column_statistics_task_settings(**kwargs)

   Gets settings for a column statistics task.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_column_statistics_task_settings(
          DatabaseName='string',
          TableName='string'
      )

   Parameters:
      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the database where the table resides.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table for which to retrieve column statistics.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'ColumnStatisticsTaskSettings': {
                 'DatabaseName': 'string',
                 'TableName': 'string',
                 'Schedule': {
                     'ScheduleExpression': 'string',
                     'State': 'SCHEDULED'|'NOT_SCHEDULED'|'TRANSITIONING'
                 },
                 'ColumnNameList': [
                     'string',
                 ],
                 'CatalogID': 'string',
                 'Role': 'string',
                 'SampleSize': 123.0,
                 'SecurityConfiguration': 'string',
                 'ScheduleType': 'CRON'|'AUTO',
                 'SettingSource': 'CATALOG'|'TABLE',
                 'LastExecutionAttempt': {
                     'Status': 'FAILED'|'STARTED',
                     'ColumnStatisticsTaskRunId': 'string',
                     'ExecutionTimestamp': datetime(2015, 1, 1),
                     'ErrorMessage': 'string'
                 }
             }
         }

      **Response Structure**

      * *(dict) --*

        * **ColumnStatisticsTaskSettings** *(dict) --*

          A "ColumnStatisticsTaskSettings" object representing the
          settings for the column statistics task.

          * **DatabaseName** *(string) --*

            The name of the database where the table resides.

          * **TableName** *(string) --*

            The name of the table for which to generate column
            statistics.

          * **Schedule** *(dict) --*

            A schedule for running the column statistics, specified in
            CRON syntax.

            * **ScheduleExpression** *(string) --*

              A "cron" expression used to specify the schedule (see
              Time-Based Schedules for Jobs and Crawlers). For
              example, to run something every day at 12:15 UTC, you
              would specify: "cron(15 12 * * ? *)".

            * **State** *(string) --*

              The state of the schedule.

          * **ColumnNameList** *(list) --*

            A list of column names for which to run statistics.

            * *(string) --*

          * **CatalogID** *(string) --*

            The ID of the Data Catalog in which the database resides.

          * **Role** *(string) --*

            The role used for running the column statistics.

          * **SampleSize** *(float) --*

            The percentage of data to sample.

          * **SecurityConfiguration** *(string) --*

            Name of the security configuration that is used to encrypt
            CloudWatch logs.

          * **ScheduleType** *(string) --*

            The type of schedule for a column statistics task.
            Possible values may be "CRON" or "AUTO".

          * **SettingSource** *(string) --*

            The source of setting the column statistics task. Possible
            values may be "CATALOG" or "TABLE".

          * **LastExecutionAttempt** *(dict) --*

            The last "ExecutionAttempt" for the column statistics task
            run.

            * **Status** *(string) --*

              The status of the last column statistics task run.

            * **ColumnStatisticsTaskRunId** *(string) --*

              A task run ID for the last column statistics task run.

            * **ExecutionTimestamp** *(datetime) --*

              A timestamp when the last column statistics task run
              occurred.

            * **ErrorMessage** *(string) --*

              An error message associated with the last column
              statistics task run.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"
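
   As a rough illustration of reading the response structure above,
   the following sketch flattens the nested settings into a short
   summary. The helper name "summarize_task_settings" and the keys of
   the returned dictionary are hypothetical; only the response field
   names come from the documented structure.

```python
def summarize_task_settings(response):
    """Flatten a get_column_statistics_task_settings response into a summary."""
    settings = response.get("ColumnStatisticsTaskSettings", {})
    schedule = settings.get("Schedule", {})
    last = settings.get("LastExecutionAttempt", {})
    return {
        "table": "{}.{}".format(settings.get("DatabaseName"),
                                settings.get("TableName")),
        "schedule_type": settings.get("ScheduleType"),
        "cron": schedule.get("ScheduleExpression"),
        "schedule_state": schedule.get("State"),
        # Nested fields may be absent, so missing values come back as None.
        "last_status": last.get("Status"),
        "last_error": last.get("ErrorMessage"),
    }
```

   The input would typically be the dictionary returned by
   "client.get_column_statistics_task_settings(DatabaseName=...,
   TableName=...)".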


delete_database
***************

Glue.Client.delete_database(**kwargs)

   Removes a specified database from a Data Catalog.

   Note:

     After completing this operation, you no longer have access to the
     tables (and all table versions and partitions that might belong
     to the tables) and the user-defined functions in the deleted
     database. Glue deletes these "orphaned" resources asynchronously
     in a timely manner, at the discretion of the service. To ensure
     the immediate deletion of all related resources, before calling
     "DeleteDatabase", use "DeleteTableVersion" or
     "BatchDeleteTableVersion", "DeletePartition" or
     "BatchDeletePartition", "DeleteUserDefinedFunction", and
     "DeleteTable" or "BatchDeleteTable", to delete any resources that
     belong to the database.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_database(
          CatalogId='string',
          Name='string'
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog in
        which the database resides. If none is provided, the Amazon
        Web Services account ID is used by default.

      * **Name** (*string*) --

        **[REQUIRED]**

        The name of the database to delete. For Hive compatibility,
        this must be all lowercase.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ConcurrentModificationException"

   * "Glue.Client.exceptions.FederationSourceException"

   * "Glue.Client.exceptions.FederationSourceRetryableException"
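
   The note above recommends deleting dependent resources before the
   database itself. A minimal sketch of the table step follows,
   assuming "client" is a Glue client and that "get_tables" and
   "batch_delete_table" are used for listing and removal. The helper
   names and the default "batch_size" of 100 are assumptions, and
   table versions, partitions, and user-defined functions are not
   handled here.

```python
def list_table_names(client, database_name):
    """Collect every table name in the database by following NextToken."""
    names, token = [], None
    while True:
        kwargs = {"DatabaseName": database_name}
        if token:
            kwargs["NextToken"] = token
        page = client.get_tables(**kwargs)
        names.extend(t["Name"] for t in page.get("TableList", []))
        token = page.get("NextToken")
        if not token:
            return names


def delete_database_with_tables(client, database_name, batch_size=100):
    """Delete the database's tables in batches, then the database itself.

    Returns the number of tables submitted for deletion.
    """
    # The Name parameter must be all lowercase for Hive compatibility.
    if database_name != database_name.lower():
        raise ValueError("database name must be all lowercase")
    names = list_table_names(client, database_name)
    for start in range(0, len(names), batch_size):
        client.batch_delete_table(
            DatabaseName=database_name,
            TablesToDelete=names[start:start + batch_size],
        )
    client.delete_database(Name=database_name)
    return len(names)
```

   Listing all names before deleting avoids paging through a
   collection that is changing underneath the loop.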


get_blueprint_runs
******************

Glue.Client.get_blueprint_runs(**kwargs)

   Retrieves the details of blueprint runs for a specified blueprint.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_blueprint_runs(
          BlueprintName='string',
          NextToken='string',
          MaxResults=123
      )

   Parameters:
      * **BlueprintName** (*string*) --

        **[REQUIRED]**

        The name of the blueprint.

      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation request.

      * **MaxResults** (*integer*) -- The maximum size of a list to
        return.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'BlueprintRuns': [
                 {
                     'BlueprintName': 'string',
                     'RunId': 'string',
                     'WorkflowName': 'string',
                     'State': 'RUNNING'|'SUCCEEDED'|'FAILED'|'ROLLING_BACK',
                     'StartedOn': datetime(2015, 1, 1),
                     'CompletedOn': datetime(2015, 1, 1),
                     'ErrorMessage': 'string',
                     'RollbackErrorMessage': 'string',
                     'Parameters': 'string',
                     'RoleArn': 'string'
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **BlueprintRuns** *(list) --*

          Returns a list of "BlueprintRun" objects.

          * *(dict) --*

            The details of a blueprint run.

            * **BlueprintName** *(string) --*

              The name of the blueprint.

            * **RunId** *(string) --*

              The run ID for this blueprint run.

            * **WorkflowName** *(string) --*

              The name of a workflow that is created as a result of a
              successful blueprint run. If a blueprint run has an
              error, there will not be a workflow created.

            * **State** *(string) --*

              The state of the blueprint run. Possible values are:

              * Running — The blueprint run is in progress.

              * Succeeded — The blueprint run completed successfully.

              * Failed — The blueprint run failed and rollback is
                complete.

              * Rolling Back — The blueprint run failed and rollback
                is in progress.

            * **StartedOn** *(datetime) --*

              The date and time that the blueprint run started.

            * **CompletedOn** *(datetime) --*

              The date and time that the blueprint run completed.

            * **ErrorMessage** *(string) --*

              Indicates any errors that are seen while running the
              blueprint.

            * **RollbackErrorMessage** *(string) --*

              If there are any errors while creating the entities of a
              workflow, we try to roll back the created entities until
              that point and delete them. This attribute indicates the
              errors seen while trying to delete the entities that are
              created.

            * **Parameters** *(string) --*

              The blueprint parameters as a string. You will have to
              provide a value for each key that is required from the
              parameter spec that is defined in the
              "Blueprint$ParameterSpec".

            * **RoleArn** *(string) --*

              The role ARN. This role will be assumed by the Glue
              service and will be used to create the workflow and
              other entities of a workflow.

        * **NextToken** *(string) --*

          A continuation token, if not all blueprint runs have been
          returned.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"
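
   Because results may span several pages, a caller usually loops on
   "NextToken". The generator below is one way to do that, assuming
   "client" is a Glue client. The names "iter_blueprint_runs" and
   "failed_runs" are hypothetical helpers, not boto3 functions.

```python
def iter_blueprint_runs(client, blueprint_name, page_size=None):
    """Yield every BlueprintRun for a blueprint, following NextToken."""
    token = None
    while True:
        kwargs = {"BlueprintName": blueprint_name}
        if page_size is not None:
            kwargs["MaxResults"] = page_size
        if token:
            kwargs["NextToken"] = token
        page = client.get_blueprint_runs(**kwargs)
        for run in page.get("BlueprintRuns", []):
            yield run
        token = page.get("NextToken")
        if not token:
            break


def failed_runs(runs):
    """Keep runs whose State is FAILED or ROLLING_BACK."""
    return [r for r in runs if r.get("State") in ("FAILED", "ROLLING_BACK")]
```

   For example, "failed_runs(iter_blueprint_runs(client, name))"
   would collect the runs whose "ErrorMessage" is worth inspecting.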


delete_dev_endpoint
*******************

Glue.Client.delete_dev_endpoint(**kwargs)

   Deletes a specified development endpoint.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_dev_endpoint(
          EndpointName='string'
      )

   Parameters:
      **EndpointName** (*string*) --

      **[REQUIRED]**

      The name of the "DevEndpoint".

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"
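
   When a missing endpoint should not be treated as a failure, the
   "EntityNotFoundException" listed above can be caught through the
   client's "exceptions" attribute. A minimal sketch, assuming
   "client" is a Glue client; the helper name
   "delete_dev_endpoint_if_exists" is hypothetical.

```python
def delete_dev_endpoint_if_exists(client, endpoint_name):
    """Delete a development endpoint; return False if it did not exist."""
    try:
        client.delete_dev_endpoint(EndpointName=endpoint_name)
    except client.exceptions.EntityNotFoundException:
        # The endpoint is already gone, so there is nothing to delete.
        return False
    return True
```

   Other exceptions, such as "InvalidInputException", are left to
   propagate to the caller.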


create_ml_transform
*******************

Glue.Client.create_ml_transform(**kwargs)

   Creates a Glue machine learning transform. This operation creates
   the transform and all the necessary parameters to train it.

   Call this operation as the first step in the process of using a
   machine learning transform (such as the "FindMatches" transform)
   for deduplicating data. You can provide an optional "Description",
   in addition to the parameters that you want to use for your
   algorithm.

   You must also specify certain parameters for the tasks that Glue
   runs on your behalf as part of learning from your data and creating
   a high-quality machine learning transform. These parameters include
   "Role", and optionally, "AllocatedCapacity", "Timeout", and
   "MaxRetries". For more information, see Jobs.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_ml_transform(
          Name='string',
          Description='string',
          InputRecordTables=[
              {
                  'DatabaseName': 'string',
                  'TableName': 'string',
                  'CatalogId': 'string',
                  'ConnectionName': 'string',
                  'AdditionalOptions': {
                      'string': 'string'
                  }
              },
          ],
          Parameters={
              'TransformType': 'FIND_MATCHES',
              'FindMatchesParameters': {
                  'PrimaryKeyColumnName': 'string',
                  'PrecisionRecallTradeoff': 123.0,
                  'AccuracyCostTradeoff': 123.0,
                  'EnforceProvidedLabels': True|False
              }
          },
          Role='string',
          GlueVersion='string',
          MaxCapacity=123.0,
          WorkerType='Standard'|'G.1X'|'G.2X'|'G.025X'|'G.4X'|'G.8X'|'Z.2X',
          NumberOfWorkers=123,
          Timeout=123,
          MaxRetries=123,
          Tags={
              'string': 'string'
          },
          TransformEncryption={
              'MlUserDataEncryption': {
                  'MlUserDataEncryptionMode': 'DISABLED'|'SSE-KMS',
                  'KmsKeyId': 'string'
              },
              'TaskRunSecurityConfigurationName': 'string'
          }
      )

   Parameters:
      * **Name** (*string*) --

        **[REQUIRED]**

        The unique name that you give the transform when you create
        it.

      * **Description** (*string*) -- A description of the machine
        learning transform that is being defined. The default is an
        empty string.

      * **InputRecordTables** (*list*) --

        **[REQUIRED]**

        A list of Glue table definitions used by the transform.

        * *(dict) --*

          The database and table in the Glue Data Catalog that is used
          for input or output data.

          * **DatabaseName** *(string) --* **[REQUIRED]**

            A database name in the Glue Data Catalog.

          * **TableName** *(string) --* **[REQUIRED]**

            A table name in the Glue Data Catalog.

          * **CatalogId** *(string) --*

            A unique identifier for the Glue Data Catalog.

          * **ConnectionName** *(string) --*

            The name of the connection to the Glue Data Catalog.

          * **AdditionalOptions** *(dict) --*

            Additional options for the table. Currently there are two
            keys supported:

            * "pushDownPredicate": to filter on partitions without
              having to list and read all the files in your dataset.

            * "catalogPartitionPredicate": to use server-side
              partition pruning using partition indexes in the Glue
              Data Catalog.

            * *(string) --*

              * *(string) --*

      * **Parameters** (*dict*) --

        **[REQUIRED]**

        The algorithmic parameters that are specific to the transform
        type used. Conditionally dependent on the transform type.

        * **TransformType** *(string) --* **[REQUIRED]**

          The type of machine learning transform.

          For information about the types of machine learning
          transforms, see Creating Machine Learning Transforms.

        * **FindMatchesParameters** *(dict) --*

          The parameters for the find matches algorithm.

          * **PrimaryKeyColumnName** *(string) --*

            The name of a column that uniquely identifies rows in the
            source table. Used to help identify matching records.

          * **PrecisionRecallTradeoff** *(float) --*

            The value selected when tuning your transform for a
            balance between precision and recall. A value of 0.5 means
            no preference; a value of 1.0 means a bias purely for
            precision, and a value of 0.0 means a bias for recall.
            Because this is a tradeoff, choosing values close to 1.0
            means very low recall, and choosing values close to 0.0
            results in very low precision.

            The precision metric indicates how often your model is
            correct when it predicts a match.

            The recall metric indicates how often your model predicts
            a match when there is an actual match.

          * **AccuracyCostTradeoff** *(float) --*

            The value that is selected when tuning your transform for
            a balance between accuracy and cost. A value of 0.5 means
            that the system balances accuracy and cost concerns. A
            value of 1.0 means a bias purely for accuracy, which
            typically results in a higher cost, sometimes
            substantially higher. A value of 0.0 means a bias purely
            for cost, which results in a less accurate "FindMatches"
            transform, sometimes with unacceptable accuracy.

            Accuracy measures how well the transform finds true
            positives and true negatives. Increasing accuracy requires
            more machine resources and cost. But it also results in
            increased recall.

            Cost measures how many compute resources, and thus money,
            are consumed to run the transform.

          * **EnforceProvidedLabels** *(boolean) --*

            The value to switch on or off to force the output to match
            the provided labels from users. If the value is "True",
            the "find matches" transform forces the output to match
            the provided labels. The results override the normal
            conflation results. If the value is "False", the "find
            matches" transform does not ensure all the labels provided
            are respected, and the results rely on the trained model.

            Note that setting this value to true may increase the
            conflation execution time.

      * **Role** (*string*) --

        **[REQUIRED]**

        The name or Amazon Resource Name (ARN) of the IAM role with
        the required permissions. The required permissions include
        both Glue service role permissions to Glue resources, and
        Amazon S3 permissions required by the transform.

        * This role needs Glue service role permissions to allow
          access to resources in Glue. See Attach a Policy to IAM
          Users That Access Glue.

        * This role needs permission to your Amazon Simple Storage
          Service (Amazon S3) sources, targets, temporary directory,
          scripts, and any libraries used by the task run for this
          transform.

      * **GlueVersion** (*string*) -- This value determines which
        version of Glue this machine learning transform is compatible
        with. Glue 1.0 is recommended for most customers. If the value
        is not set, the Glue compatibility defaults to Glue 0.9. For
        more information, see Glue Versions in the developer guide.

      * **MaxCapacity** (*float*) --

        The number of Glue data processing units (DPUs) that are
        allocated to task runs for this transform. You can allocate
        from 2 to 100 DPUs; the default is 10. A DPU is a relative
        measure of processing power that consists of 4 vCPUs of
        compute capacity and 16 GB of memory. For more information,
        see the Glue pricing page.

        "MaxCapacity" is a mutually exclusive option with
        "NumberOfWorkers" and "WorkerType".

        * If either "NumberOfWorkers" or "WorkerType" is set, then
          "MaxCapacity" cannot be set.

        * If "MaxCapacity" is set then neither "NumberOfWorkers" nor
          "WorkerType" can be set.

        * If "WorkerType" is set, then "NumberOfWorkers" is required
          (and vice versa).

        * "MaxCapacity" and "NumberOfWorkers" must both be at least 1.

        When the "WorkerType" field is set to a value other than
        "Standard", the "MaxCapacity" field is set automatically and
        becomes read-only.

      * **WorkerType** (*string*) --

        The type of predefined worker that is allocated when this task
        runs. Accepts a value of Standard, G.1X, or G.2X.

        * For the "Standard" worker type, each worker provides 4 vCPU,
          16 GB of memory and a 50GB disk, and 2 executors per worker.

        * For the "G.1X" worker type, each worker provides 4 vCPU, 16
          GB of memory and a 64GB disk, and 1 executor per worker.

        * For the "G.2X" worker type, each worker provides 8 vCPU, 32
          GB of memory and a 128GB disk, and 1 executor per worker.

        "MaxCapacity" is a mutually exclusive option with
        "NumberOfWorkers" and "WorkerType".

        * If either "NumberOfWorkers" or "WorkerType" is set, then
          "MaxCapacity" cannot be set.

        * If "MaxCapacity" is set then neither "NumberOfWorkers" nor
          "WorkerType" can be set.

        * If "WorkerType" is set, then "NumberOfWorkers" is required
          (and vice versa).

        * "MaxCapacity" and "NumberOfWorkers" must both be at least 1.

      * **NumberOfWorkers** (*integer*) --

        The number of workers of a defined "WorkerType" that are
        allocated when this task runs.

        If "WorkerType" is set, then "NumberOfWorkers" is required
        (and vice versa).

      * **Timeout** (*integer*) -- The timeout of the task run for
        this transform in minutes. This is the maximum time that a
        task run for this transform can consume resources before it is
        terminated and enters "TIMEOUT" status. The default is 2,880
        minutes (48 hours).

      * **MaxRetries** (*integer*) -- The maximum number of times to
        retry a task for this transform after a task run fails.

      * **Tags** (*dict*) --

        The tags to use with this machine learning transform. You may
        use tags to limit access to the machine learning transform.
        For more information about tags in Glue, see Amazon Web
        Services Tags in Glue in the developer guide.

        * *(string) --*

          * *(string) --*

      * **TransformEncryption** (*dict*) --

        The encryption-at-rest settings of the transform that apply to
        accessing user data. Machine learning transforms can access
        user data encrypted in Amazon S3 using KMS.

        * **MlUserDataEncryption** *(dict) --*

          An "MLUserDataEncryption" object containing the encryption
          mode and customer-provided KMS key ID.

          * **MlUserDataEncryptionMode** *(string) --* **[REQUIRED]**

            The encryption mode applied to user data. Valid values
            are:

            * DISABLED: encryption is disabled

            * SSE-KMS: use of server-side encryption with Key
              Management Service (SSE-KMS) for user data stored in
              Amazon S3.

          * **KmsKeyId** *(string) --*

            The ID for the customer-provided KMS key.

        * **TaskRunSecurityConfigurationName** *(string) --*

          The name of the security configuration.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'TransformId': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **TransformId** *(string) --*

          A unique identifier that is generated for the transform.

   **Exceptions**

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.IdempotentParameterMismatchException"
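
   The capacity rules described for "MaxCapacity", "WorkerType", and
   "NumberOfWorkers" can be checked on the caller's side before the
   request is sent. The sketch below is one possible pre-flight check.
   The function name "build_capacity_args" is hypothetical, and the
   service remains the authority on what it accepts.

```python
def build_capacity_args(max_capacity=None, worker_type=None,
                        number_of_workers=None):
    """Return the capacity keyword arguments for create_ml_transform.

    Enforces the documented rules: MaxCapacity is mutually exclusive
    with WorkerType/NumberOfWorkers, the latter two must be set
    together, and both MaxCapacity and NumberOfWorkers must be >= 1.
    """
    uses_workers = worker_type is not None or number_of_workers is not None
    if max_capacity is not None and uses_workers:
        raise ValueError(
            "MaxCapacity cannot be combined with WorkerType/NumberOfWorkers")
    if uses_workers:
        if worker_type is None or number_of_workers is None:
            raise ValueError(
                "WorkerType and NumberOfWorkers must be set together")
        if number_of_workers < 1:
            raise ValueError("NumberOfWorkers must be at least 1")
        return {"WorkerType": worker_type,
                "NumberOfWorkers": number_of_workers}
    if max_capacity is not None:
        if max_capacity < 1:
            raise ValueError("MaxCapacity must be at least 1")
        return {"MaxCapacity": float(max_capacity)}
    # Nothing supplied: let the service apply its defaults.
    return {}
```

   The returned dictionary can be merged into the other arguments,
   for example "client.create_ml_transform(Name=..., Role=...,
   **build_capacity_args(worker_type='G.1X', number_of_workers=5))".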


create_schema
*************

Glue.Client.create_schema(**kwargs)

   Creates a new schema set and registers the schema definition.
   Returns an error if the schema set already exists without actually
   registering the version.

   When the schema set is created, a version checkpoint will be set to
   the first version. Compatibility mode "DISABLED" restricts any
   additional schema versions from being added after the first schema
   version. For all other compatibility modes, validation of
   compatibility settings will be applied only from the second version
   onwards when the "RegisterSchemaVersion" API is used.

   When this API is called without a "RegistryId", this will create an
   entry for a "default-registry" in the registry database tables, if
   it is not already present.
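
   To illustrate the default-registry behaviour, the sketch below
   builds a request that omits "RegistryId" unless a registry name is
   given. The helper name "build_create_schema_request" and the
   sample Avro record are hypothetical and shown only as an example.

```python
import json


def build_create_schema_request(schema_name, data_format, schema_definition,
                                registry_name=None, compatibility=None):
    """Build keyword arguments for Glue.Client.create_schema.

    RegistryId is left out when no registry name is given, in which
    case the service falls back to the default registry.
    """
    request = {
        "SchemaName": schema_name,
        "DataFormat": data_format,
        "SchemaDefinition": schema_definition,
    }
    if registry_name is not None:
        request["RegistryId"] = {"RegistryName": registry_name}
    if compatibility is not None:
        request["Compatibility"] = compatibility
    return request


# A small, made-up Avro record used only as sample input.
example_definition = json.dumps({
    "type": "record",
    "name": "Order",
    "fields": [{"name": "id", "type": "string"}],
})
example_request = build_create_schema_request(
    "orders", "AVRO", example_definition, compatibility="BACKWARD")
```

   With a real client this would be sent as
   "client.create_schema(**example_request)".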

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_schema(
          RegistryId={
              'RegistryName': 'string',
              'RegistryArn': 'string'
          },
          SchemaName='string',
          DataFormat='AVRO'|'JSON'|'PROTOBUF',
          Compatibility='NONE'|'DISABLED'|'BACKWARD'|'BACKWARD_ALL'|'FORWARD'|'FORWARD_ALL'|'FULL'|'FULL_ALL',
          Description='string',
          Tags={
              'string': 'string'
          },
          SchemaDefinition='string'
      )

   Parameters:
      * **RegistryId** (*dict*) --

        This is a wrapper shape to contain the registry identity
        fields. If this is not provided, the default registry will be
        used. The ARN format for the same will be:
        "arn:aws:glue:us-east-2:<customer id>:registry/default-registry:random-5-letter-id".

        * **RegistryName** *(string) --*

          Name of the registry. Used only for lookup. One of
          "RegistryArn" or "RegistryName" has to be provided.

        * **RegistryArn** *(string) --*

          ARN of the registry to be updated. One of "RegistryArn" or
          "RegistryName" has to be provided.

      * **SchemaName** (*string*) --

        **[REQUIRED]**

        Name of the schema to be created. The name has a maximum
        length of 255 characters and may contain only letters,
        numbers, hyphens, underscores, dollar signs, or hash marks.
        No whitespace.

      * **DataFormat** (*string*) --

        **[REQUIRED]**

        The data format of the schema definition. Currently "AVRO",
        "JSON" and "PROTOBUF" are supported.

      * **Compatibility** (*string*) --

        The compatibility mode of the schema. The possible values are:

        * *NONE*: No compatibility mode applies. You can use this
          choice in development scenarios or if you do not know the
          compatibility mode that you want to apply to schemas. Any
          new version added will be accepted without undergoing a
          compatibility check.

        * *DISABLED*: This compatibility choice prevents versioning
          for a particular schema. You can use this choice to prevent
          future versioning of a schema.

        * *BACKWARD*: This compatibility choice is recommended as it
          allows data receivers to read both the current and one
          previous schema version. This means that for instance, a new
          schema version cannot drop data fields or change the type of
          these fields, so they can't be read by readers using the
          previous version.

        * *BACKWARD_ALL*: This compatibility choice allows data
          receivers to read both the current and all previous schema
          versions. You can use this choice when you need to delete
          fields or add optional fields, and check compatibility
          against all previous schema versions.

        * *FORWARD*: This compatibility choice allows data receivers
          to read both the current and one next schema version, but
          not necessarily later versions. You can use this choice when
          you need to add fields or delete optional fields, but only
          check compatibility against the last schema version.

        * *FORWARD_ALL*: This compatibility choice allows data
          receivers to read data written by producers of any new
          registered schema. You can use this choice when you need to
          add fields or delete optional fields, and check
          compatibility against all previous schema versions.

        * *FULL*: This compatibility choice allows data receivers to
          read data written by producers using the previous or next
          version of the schema, but not necessarily earlier or later
          versions. You can use this choice when you need to add or
          remove optional fields, but only check compatibility against
          the last schema version.

        * *FULL_ALL*: This compatibility choice allows data receivers
          to read data written by producers using all previous schema
          versions. You can use this choice when you need to add or
          remove optional fields, and check compatibility against all
          previous schema versions.

      * **Description** (*string*) -- An optional description of the
        schema. If description is not provided, there will not be any
        automatic default value for this.

      * **Tags** (*dict*) --

        Amazon Web Services tags that contain a key value pair and may
        be searched by console, command line, or API. If specified,
        follows the Amazon Web Services tags-on-create pattern.

        * *(string) --*

          * *(string) --*

      * **SchemaDefinition** (*string*) -- The schema definition using
        the "DataFormat" setting for "SchemaName".

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'RegistryName': 'string',
             'RegistryArn': 'string',
             'SchemaName': 'string',
             'SchemaArn': 'string',
             'Description': 'string',
             'DataFormat': 'AVRO'|'JSON'|'PROTOBUF',
             'Compatibility': 'NONE'|'DISABLED'|'BACKWARD'|'BACKWARD_ALL'|'FORWARD'|'FORWARD_ALL'|'FULL'|'FULL_ALL',
             'SchemaCheckpoint': 123,
             'LatestSchemaVersion': 123,
             'NextSchemaVersion': 123,
             'SchemaStatus': 'AVAILABLE'|'PENDING'|'DELETING',
             'Tags': {
                 'string': 'string'
             },
             'SchemaVersionId': 'string',
             'SchemaVersionStatus': 'AVAILABLE'|'PENDING'|'FAILURE'|'DELETING'
         }

      **Response Structure**

      * *(dict) --*

        * **RegistryName** *(string) --*

          The name of the registry.

        * **RegistryArn** *(string) --*

          The Amazon Resource Name (ARN) of the registry.

        * **SchemaName** *(string) --*

          The name of the schema.

        * **SchemaArn** *(string) --*

          The Amazon Resource Name (ARN) of the schema.

        * **Description** *(string) --*

          A description of the schema if specified when created.

        * **DataFormat** *(string) --*

          The data format of the schema definition. Currently "AVRO",
          "JSON" and "PROTOBUF" are supported.

        * **Compatibility** *(string) --*

          The schema compatibility mode.

        * **SchemaCheckpoint** *(integer) --*

          The version number of the checkpoint (the last time the
          compatibility mode was changed).

        * **LatestSchemaVersion** *(integer) --*

          The latest version of the schema associated with the
          returned schema definition.

        * **NextSchemaVersion** *(integer) --*

          The next version of the schema associated with the returned
          schema definition.

        * **SchemaStatus** *(string) --*

          The status of the schema.

        * **Tags** *(dict) --*

          The tags for the schema.

          * *(string) --*

            * *(string) --*

        * **SchemaVersionId** *(string) --*

          The unique identifier of the first schema version.

        * **SchemaVersionStatus** *(string) --*

          The status of the first schema version created.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.ConcurrentModificationException"

   * "Glue.Client.exceptions.InternalServiceException"
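
The parameter rules above can be sketched as a small request builder. This is a minimal illustration, not part of the boto3 API: "build_create_schema_request", the Avro record, and the names "orders-v1" and "my-registry" are hypothetical, and the regular expression assumes that "letters" means ASCII letters.

```python
import json
import re

# Allowed characters per the SchemaName description: letters, numbers, hyphen,
# underscore, dollar sign, hash mark; at most 255 characters.
# Assumption: "letters" is read here as ASCII letters only.
_SCHEMA_NAME_RE = re.compile(r"[A-Za-z0-9_$#-]{1,255}")


def build_create_schema_request(schema_name, definition, registry_name=None,
                                compatibility="BACKWARD"):
    """Build keyword arguments for client.create_schema (hypothetical helper)."""
    if not _SCHEMA_NAME_RE.fullmatch(schema_name):
        raise ValueError(f"invalid schema name: {schema_name!r}")
    request = {
        "SchemaName": schema_name,
        "DataFormat": "AVRO",
        "Compatibility": compatibility,
        # SchemaDefinition is passed as a string, so serialize the dict.
        "SchemaDefinition": json.dumps(definition),
    }
    # When RegistryId is omitted, the default registry is used.
    if registry_name is not None:
        request["RegistryId"] = {"RegistryName": registry_name}
    return request


# Illustrative Avro record; the names are placeholders.
example_definition = {
    "type": "record",
    "name": "Order",
    "fields": [{"name": "order_id", "type": "string"}],
}
request = build_create_schema_request("orders-v1", example_definition,
                                      registry_name="my-registry")
# With a real client: response = client.create_schema(**request)
```

Validating the name locally only avoids an obvious round trip; the service remains the authority on what it accepts.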


list_registries
***************

Glue.Client.list_registries(**kwargs)

   Returns a list of registries that you have created, with minimal
   registry information. Registries in the "Deleting" status will not
   be included in the results. Empty results will be returned if there
   are no registries available.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.list_registries(
          MaxResults=123,
          NextToken='string'
      )

   Parameters:
      * **MaxResults** (*integer*) -- Maximum number of results
        required per page. If the value is not supplied, this will be
        defaulted to 25 per page.

      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation call.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Registries': [
                 {
                     'RegistryName': 'string',
                     'RegistryArn': 'string',
                     'Description': 'string',
                     'Status': 'AVAILABLE'|'DELETING',
                     'CreatedTime': 'string',
                     'UpdatedTime': 'string'
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Registries** *(list) --*

          An array of "RegistryDetailedListItem" objects containing
          minimal details of each registry.

          * *(dict) --*

            A structure containing the details for a registry.

            * **RegistryName** *(string) --*

              The name of the registry.

            * **RegistryArn** *(string) --*

              The Amazon Resource Name (ARN) of the registry.

            * **Description** *(string) --*

              A description of the registry.

            * **Status** *(string) --*

              The status of the registry.

            * **CreatedTime** *(string) --*

              The date the registry was created.

            * **UpdatedTime** *(string) --*

              The date the registry was updated.

        * **NextToken** *(string) --*

          A continuation token for paginating the returned list of
          registries, returned if the current segment of the list is
          not the last.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.InternalServiceException"
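
The "NextToken" handshake described above can be followed with a short loop. This is a sketch under stated assumptions: "iter_registries" is a hypothetical helper, and "client" is any object exposing "list_registries" with the request and response shapes shown in this section (for example "boto3.client('glue')").

```python
def iter_registries(client, page_size=25):
    """Yield every registry summary, following NextToken across pages."""
    kwargs = {"MaxResults": page_size}
    while True:
        page = client.list_registries(**kwargs)
        yield from page.get("Registries", [])
        token = page.get("NextToken")
        if not token:
            # No token means this was the last segment of the list.
            break
        kwargs["NextToken"] = token


# Usage with a real client (not executed here):
#   import boto3
#   for registry in iter_registries(boto3.client("glue")):
#       print(registry["RegistryName"], registry["Status"])
```

If "can_paginate" returns true for this operation, the client's built-in paginator may be the simpler choice; the explicit loop is shown only to make the token flow visible.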


get_session
***********

Glue.Client.get_session(**kwargs)

   Retrieves the session.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_session(
          Id='string',
          RequestOrigin='string'
      )

   Parameters:
      * **Id** (*string*) --

        **[REQUIRED]**

        The ID of the session.

      * **RequestOrigin** (*string*) -- The origin of the request.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Session': {
                 'Id': 'string',
                 'CreatedOn': datetime(2015, 1, 1),
                 'Status': 'PROVISIONING'|'READY'|'FAILED'|'TIMEOUT'|'STOPPING'|'STOPPED',
                 'ErrorMessage': 'string',
                 'Description': 'string',
                 'Role': 'string',
                 'Command': {
                     'Name': 'string',
                     'PythonVersion': 'string'
                 },
                 'DefaultArguments': {
                     'string': 'string'
                 },
                 'Connections': {
                     'Connections': [
                         'string',
                     ]
                 },
                 'Progress': 123.0,
                 'MaxCapacity': 123.0,
                 'SecurityConfiguration': 'string',
                 'GlueVersion': 'string',
                 'NumberOfWorkers': 123,
                 'WorkerType': 'Standard'|'G.1X'|'G.2X'|'G.025X'|'G.4X'|'G.8X'|'Z.2X',
                 'CompletedOn': datetime(2015, 1, 1),
                 'ExecutionTime': 123.0,
                 'DPUSeconds': 123.0,
                 'IdleTimeout': 123,
                 'ProfileName': 'string'
             }
         }

      **Response Structure**

      * *(dict) --*

        * **Session** *(dict) --*

          The session object is returned in the response.

          * **Id** *(string) --*

            The ID of the session.

          * **CreatedOn** *(datetime) --*

            The time and date when the session was created.

          * **Status** *(string) --*

            The session status.

          * **ErrorMessage** *(string) --*

            The error message displayed during the session.

          * **Description** *(string) --*

            The description of the session.

          * **Role** *(string) --*

            The name or Amazon Resource Name (ARN) of the IAM role
            associated with the Session.

          * **Command** *(dict) --*

            The command object. See SessionCommand.

            * **Name** *(string) --*

              Specifies the name of the SessionCommand. Can be
              'glueetl' or 'gluestreaming'.

            * **PythonVersion** *(string) --*

              Specifies the Python version. The Python version
              indicates the version supported for jobs of type Spark.

          * **DefaultArguments** *(dict) --*

            A map array of key-value pairs. Max is 75 pairs.

            * *(string) --*

              * *(string) --*

          * **Connections** *(dict) --*

            The number of connections used for the session.

            * **Connections** *(list) --*

              A list of connections used by the job.

              * *(string) --*

          * **Progress** *(float) --*

            The code execution progress of the session.

          * **MaxCapacity** *(float) --*

            The number of Glue data processing units (DPUs) that can
            be allocated when the job runs. A DPU is a relative
            measure of processing power that consists of 4 vCPUs of
            compute capacity and 16 GB memory.

          * **SecurityConfiguration** *(string) --*

            The name of the SecurityConfiguration structure to be used
            with the session.

          * **GlueVersion** *(string) --*

            The Glue version determines the versions of Apache Spark
            and Python that Glue supports. The GlueVersion must be
            greater than 2.0.

          * **NumberOfWorkers** *(integer) --*

            The number of workers of a defined "WorkerType" to use for
            the session.

          * **WorkerType** *(string) --*

            The type of predefined worker that is allocated when a
            session runs. Accepts a value of "G.1X", "G.2X", "G.4X",
            or "G.8X" for Spark sessions. Accepts the value "Z.2X" for
            Ray sessions.

          * **CompletedOn** *(datetime) --*

            The date and time that this session is completed.

          * **ExecutionTime** *(float) --*

            The total time the session ran for.

          * **DPUSeconds** *(float) --*

            The DPUs consumed by the session (formula: ExecutionTime *
            MaxCapacity).

          * **IdleTimeout** *(integer) --*

            The number of minutes when idle before the session times
            out.

          * **ProfileName** *(string) --*

            The name of a Glue usage profile associated with the
            session.

   **Exceptions**

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"
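
A common use of "get_session" is to poll until the session leaves "PROVISIONING". The sketch below is illustrative only: "wait_until_ready" and "estimated_dpu_seconds" are hypothetical helpers, the poll interval and limit are arbitrary, and treating "FAILED", "TIMEOUT", "STOPPING", and "STOPPED" as states that never become "READY" is an assumption rather than documented behavior.

```python
import time

# Assumption: these statuses never transition to READY.
_NOT_RECOVERABLE = {"FAILED", "TIMEOUT", "STOPPING", "STOPPED"}


def wait_until_ready(client, session_id, poll_seconds=5.0, max_polls=60,
                     sleep=time.sleep):
    """Poll get_session until Status is READY and return the Session dict."""
    for _ in range(max_polls):
        session = client.get_session(Id=session_id)["Session"]
        status = session["Status"]
        if status == "READY":
            return session
        if status in _NOT_RECOVERABLE:
            raise RuntimeError(
                f"session {session_id} is {status}: "
                f"{session.get('ErrorMessage', 'no error message')}"
            )
        sleep(poll_seconds)
    raise TimeoutError(f"session {session_id} not READY after {max_polls} polls")


def estimated_dpu_seconds(session):
    """Apply the formula given for DPUSeconds: ExecutionTime * MaxCapacity."""
    return session["ExecutionTime"] * session["MaxCapacity"]
```

The "sleep" parameter exists so that callers and tests can substitute their own delay function.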


get_user_defined_functions
**************************

Glue.Client.get_user_defined_functions(**kwargs)

   Retrieves multiple function definitions from the Data Catalog.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_user_defined_functions(
          CatalogId='string',
          DatabaseName='string',
          Pattern='string',
          NextToken='string',
          MaxResults=123
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog where
        the functions to be retrieved are located. If none is
        provided, the Amazon Web Services account ID is used by
        default.

      * **DatabaseName** (*string*) -- The name of the catalog
        database where the functions are located. If none is provided,
        functions from all the databases across the catalog will be
        returned.

      * **Pattern** (*string*) --

        **[REQUIRED]**

        A function-name pattern string that filters the function
        definitions returned.

      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation call.

      * **MaxResults** (*integer*) -- The maximum number of functions
        to return in one response.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'UserDefinedFunctions': [
                 {
                     'FunctionName': 'string',
                     'DatabaseName': 'string',
                     'ClassName': 'string',
                     'OwnerName': 'string',
                     'OwnerType': 'USER'|'ROLE'|'GROUP',
                     'CreateTime': datetime(2015, 1, 1),
                     'ResourceUris': [
                         {
                             'ResourceType': 'JAR'|'FILE'|'ARCHIVE',
                             'Uri': 'string'
                         },
                     ],
                     'CatalogId': 'string'
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **UserDefinedFunctions** *(list) --*

          A list of requested function definitions.

          * *(dict) --*

            Represents the equivalent of a Hive user-defined function
            ("UDF") definition.

            * **FunctionName** *(string) --*

              The name of the function.

            * **DatabaseName** *(string) --*

              The name of the catalog database that contains the
              function.

            * **ClassName** *(string) --*

              The Java class that contains the function code.

            * **OwnerName** *(string) --*

              The owner of the function.

            * **OwnerType** *(string) --*

              The owner type.

            * **CreateTime** *(datetime) --*

              The time at which the function was created.

            * **ResourceUris** *(list) --*

              The resource URIs for the function.

              * *(dict) --*

                The URIs for function resources.

                * **ResourceType** *(string) --*

                  The type of the resource.

                * **Uri** *(string) --*

                  The URI for accessing the resource.

            * **CatalogId** *(string) --*

              The ID of the Data Catalog in which the function
              resides.

        * **NextToken** *(string) --*

          A continuation token, if the list of functions returned does
          not include the last requested function.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.GlueEncryptionException"
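
Because results may span several pages, a caller typically accumulates them while following "NextToken". The sketch below groups function names by database; "functions_by_database" is a hypothetical helper, and the pattern syntax is left to the caller because this section does not specify it.

```python
from collections import defaultdict


def functions_by_database(client, pattern, database_name=None):
    """Group matching function names by DatabaseName across all pages."""
    kwargs = {"Pattern": pattern}
    if database_name is not None:
        # Without DatabaseName, functions from all databases are returned.
        kwargs["DatabaseName"] = database_name
    grouped = defaultdict(list)
    while True:
        page = client.get_user_defined_functions(**kwargs)
        for udf in page.get("UserDefinedFunctions", []):
            grouped[udf["DatabaseName"]].append(udf["FunctionName"])
        token = page.get("NextToken")
        if not token:
            return dict(grouped)
        kwargs["NextToken"] = token
```

Returning a plain "dict" keeps the result free of the "defaultdict" behavior of silently creating keys on lookup.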


get_custom_entity_type
**********************

Glue.Client.get_custom_entity_type(**kwargs)

   Retrieves the details of a custom pattern by specifying its name.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_custom_entity_type(
          Name='string'
      )

   Parameters:
      **Name** (*string*) --

      **[REQUIRED]**

      The name of the custom pattern that you want to retrieve.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Name': 'string',
             'RegexString': 'string',
             'ContextWords': [
                 'string',
             ]
         }

      **Response Structure**

      * *(dict) --*

        * **Name** *(string) --*

          The name of the custom pattern that you retrieved.

        * **RegexString** *(string) --*

          A regular expression string that is used for detecting
          sensitive data in a custom pattern.

        * **ContextWords** *(list) --*

          A list of context words if specified when you created the
          custom pattern. If none of these context words are found
          within the vicinity of the regular expression the data will
          not be detected as sensitive data.

          * *(string) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"
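
The interaction between "RegexString" and "ContextWords" can be illustrated locally. This is a rough sketch and not Glue's detection logic: "find_sensitive" is a hypothetical helper, the "window" size that stands in for "vicinity" is an invented value, and Python's "re" module may not match the regular expression dialect that Glue uses.

```python
import re


def find_sensitive(text, entity, window=30):
    """Return regex matches that have a context word nearby.

    `entity` has the shape of a get_custom_entity_type response. `window`
    (characters on each side of a match) is an assumption, not a Glue value.
    """
    pattern = re.compile(entity["RegexString"])
    context_words = [w.lower() for w in entity.get("ContextWords", [])]
    hits = []
    for match in pattern.finditer(text):
        if not context_words:
            # No context words were configured, so the regex alone decides.
            hits.append(match.group())
            continue
        start = max(0, match.start() - window)
        nearby = text[start:match.end() + window].lower()
        if any(word in nearby for word in context_words):
            hits.append(match.group())
    return hits
```

In practice "entity" would come from "client.get_custom_entity_type(Name=...)"; the helper only mirrors the rule that a match without a nearby context word is not treated as sensitive.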


get_ml_transform
****************

Glue.Client.get_ml_transform(**kwargs)

   Gets a Glue machine learning transform artifact and all its
   corresponding metadata. Machine learning transforms are a special
   type of transform that use machine learning to learn the details of
   the transformation to be performed by learning from examples
   provided by humans. These transformations are then saved by Glue.
   You can retrieve their metadata by calling "GetMLTransform".

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_ml_transform(
          TransformId='string'
      )

   Parameters:
      **TransformId** (*string*) --

      **[REQUIRED]**

      The unique identifier of the transform, generated at the time
      that the transform was created.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'TransformId': 'string',
             'Name': 'string',
             'Description': 'string',
             'Status': 'NOT_READY'|'READY'|'DELETING',
             'CreatedOn': datetime(2015, 1, 1),
             'LastModifiedOn': datetime(2015, 1, 1),
             'InputRecordTables': [
                 {
                     'DatabaseName': 'string',
                     'TableName': 'string',
                     'CatalogId': 'string',
                     'ConnectionName': 'string',
                     'AdditionalOptions': {
                         'string': 'string'
                     }
                 },
             ],
             'Parameters': {
                 'TransformType': 'FIND_MATCHES',
                 'FindMatchesParameters': {
                     'PrimaryKeyColumnName': 'string',
                     'PrecisionRecallTradeoff': 123.0,
                     'AccuracyCostTradeoff': 123.0,
                     'EnforceProvidedLabels': True|False
                 }
             },
             'EvaluationMetrics': {
                 'TransformType': 'FIND_MATCHES',
                 'FindMatchesMetrics': {
                     'AreaUnderPRCurve': 123.0,
                     'Precision': 123.0,
                     'Recall': 123.0,
                     'F1': 123.0,
                     'ConfusionMatrix': {
                         'NumTruePositives': 123,
                         'NumFalsePositives': 123,
                         'NumTrueNegatives': 123,
                         'NumFalseNegatives': 123
                     },
                     'ColumnImportances': [
                         {
                             'ColumnName': 'string',
                             'Importance': 123.0
                         },
                     ]
                 }
             },
             'LabelCount': 123,
             'Schema': [
                 {
                     'Name': 'string',
                     'DataType': 'string'
                 },
             ],
             'Role': 'string',
             'GlueVersion': 'string',
             'MaxCapacity': 123.0,
             'WorkerType': 'Standard'|'G.1X'|'G.2X'|'G.025X'|'G.4X'|'G.8X'|'Z.2X',
             'NumberOfWorkers': 123,
             'Timeout': 123,
             'MaxRetries': 123,
             'TransformEncryption': {
                 'MlUserDataEncryption': {
                     'MlUserDataEncryptionMode': 'DISABLED'|'SSE-KMS',
                     'KmsKeyId': 'string'
                 },
                 'TaskRunSecurityConfigurationName': 'string'
             }
         }

      **Response Structure**

      * *(dict) --*

        * **TransformId** *(string) --*

          The unique identifier of the transform, generated at the
          time that the transform was created.

        * **Name** *(string) --*

          The unique name given to the transform when it was created.

        * **Description** *(string) --*

          A description of the transform.

        * **Status** *(string) --*

          The last known status of the transform (to indicate whether
          it can be used or not). One of "NOT_READY", "READY", or
          "DELETING".

        * **CreatedOn** *(datetime) --*

          The date and time when the transform was created.

        * **LastModifiedOn** *(datetime) --*

          The date and time when the transform was last modified.

        * **InputRecordTables** *(list) --*

          A list of Glue table definitions used by the transform.

          * *(dict) --*

            The database and table in the Glue Data Catalog that is
            used for input or output data.

            * **DatabaseName** *(string) --*

              A database name in the Glue Data Catalog.

            * **TableName** *(string) --*

              A table name in the Glue Data Catalog.

            * **CatalogId** *(string) --*

              A unique identifier for the Glue Data Catalog.

            * **ConnectionName** *(string) --*

              The name of the connection to the Glue Data Catalog.

            * **AdditionalOptions** *(dict) --*

              Additional options for the table. Currently there are
              two keys supported:

              * "pushDownPredicate": to filter on partitions without
                having to list and read all the files in your dataset.

              * "catalogPartitionPredicate": to use server-side
                partition pruning using partition indexes in the Glue
                Data Catalog.

              * *(string) --*

                * *(string) --*

        * **Parameters** *(dict) --*

          The configuration parameters that are specific to the
          algorithm used.

          * **TransformType** *(string) --*

            The type of machine learning transform.

            For information about the types of machine learning
            transforms, see Creating Machine Learning Transforms.

          * **FindMatchesParameters** *(dict) --*

            The parameters for the find matches algorithm.

            * **PrimaryKeyColumnName** *(string) --*

              The name of a column that uniquely identifies rows in
              the source table. Used to help identify matching
              records.

            * **PrecisionRecallTradeoff** *(float) --*

              The value selected when tuning your transform for a
              balance between precision and recall. A value of 0.5
              means no preference; a value of 1.0 means a bias purely
              for precision, and a value of 0.0 means a bias for
              recall. Because this is a tradeoff, choosing values
              close to 1.0 means very low recall, and choosing values
              close to 0.0 results in very low precision.

              The precision metric indicates how often your model is
              correct when it predicts a match.

              The recall metric indicates how often your model
              predicts a match when there is an actual match.

            * **AccuracyCostTradeoff** *(float) --*

              The value that is selected when tuning your transform
              for a balance between accuracy and cost. A value of 0.5
              means that the system balances accuracy and cost
              concerns. A value of 1.0 means a bias purely for
              accuracy, which typically results in a higher cost,
              sometimes substantially higher. A value of 0.0 means a
              bias purely for cost, which results in a less accurate
              "FindMatches" transform, sometimes with unacceptable
              accuracy.

              Accuracy measures how well the transform finds true
              positives and true negatives. Increasing accuracy
              requires more machine resources and cost. But it also
              results in increased recall.

              Cost measures how many compute resources, and thus
              money, are consumed to run the transform.

            * **EnforceProvidedLabels** *(boolean) --*

              The value to switch on or off to force the output to
              match the provided labels from users. If the value is
              "True", the "find matches" transform forces the output
              to match the provided labels. The results override the
              normal conflation results. If the value is "False", the
              "find matches" transform does not ensure all the labels
              provided are respected, and the results rely on the
              trained model.

              Note that setting this value to true may increase the
              conflation execution time.

        * **EvaluationMetrics** *(dict) --*

          The latest evaluation metrics.

          * **TransformType** *(string) --*

            The type of machine learning transform.

          * **FindMatchesMetrics** *(dict) --*

            The evaluation metrics for the find matches algorithm.

            * **AreaUnderPRCurve** *(float) --*

              The area under the precision/recall curve (AUPRC) is a
              single number measuring the overall quality of the
              transform, that is independent of the choice made for
              precision vs. recall. Higher values indicate that you
              have a more attractive precision vs. recall tradeoff.

              For more information, see Precision and recall in
              Wikipedia.

            * **Precision** *(float) --*

              The precision metric indicates how often your transform
              is correct when it predicts a match. Specifically, it
              measures how well the transform finds true positives
              from the total true positives possible.

              For more information, see Precision and recall in
              Wikipedia.

            * **Recall** *(float) --*

              The recall metric indicates how often your transform
              predicts a match when there is an actual match.
              Specifically, it measures how well the transform finds
              true positives from the total records in the source
              data.

              For more information, see Precision and recall in
              Wikipedia.

            * **F1** *(float) --*

              The maximum F1 metric indicates the transform's accuracy
              between 0 and 1, where 1 is the best accuracy.

              For more information, see F1 score in Wikipedia.

            * **ConfusionMatrix** *(dict) --*

              The confusion matrix shows you what your transform is
              predicting accurately and what types of errors it is
              making.

              For more information, see Confusion matrix in Wikipedia.

              * **NumTruePositives** *(integer) --*

                The number of matches in the data that the transform
                correctly found, in the confusion matrix for your
                transform.

              * **NumFalsePositives** *(integer) --*

                The number of nonmatches in the data that the
                transform incorrectly classified as a match, in the
                confusion matrix for your transform.

              * **NumTrueNegatives** *(integer) --*

                The number of nonmatches in the data that the
                transform correctly rejected, in the confusion matrix
                for your transform.

              * **NumFalseNegatives** *(integer) --*

                The number of matches in the data that the transform
                didn't find, in the confusion matrix for your
                transform.

            * **ColumnImportances** *(list) --*

              A list of "ColumnImportance" structures containing
              column importance metrics, sorted in order of descending
              importance.

              * *(dict) --*

                A structure containing the column name and column
                importance score for a column.

                Column importance helps you understand how columns
                contribute to your model, by identifying which columns
                in your records are more important than others.

                * **ColumnName** *(string) --*

                  The name of a column.

                * **Importance** *(float) --*

                  The column importance score for the column, as a
                  decimal.

        * **LabelCount** *(integer) --*

          The number of labels available for this transform.

        * **Schema** *(list) --*

          The "Map<Column, Type>" object that represents the schema
          that this transform accepts. Has an upper bound of 100
          columns.

          * *(dict) --*

            A key-value pair representing a column and data type that
            this transform can run against. The "Schema" parameter of
            the "MLTransform" may contain up to 100 of these
            structures.

            * **Name** *(string) --*

              The name of the column.

            * **DataType** *(string) --*

              The type of data in the column.

        * **Role** *(string) --*

          The name or Amazon Resource Name (ARN) of the IAM role with
          the required permissions.

        * **GlueVersion** *(string) --*

          This value determines which version of Glue this machine
          learning transform is compatible with. Glue 1.0 is
          recommended for most customers. If the value is not set, the
          Glue compatibility defaults to Glue 0.9. For more
          information, see Glue Versions in the developer guide.

        * **MaxCapacity** *(float) --*

          The number of Glue data processing units (DPUs) that are
          allocated to task runs for this transform. You can allocate
          from 2 to 100 DPUs; the default is 10. A DPU is a relative
          measure of processing power that consists of 4 vCPUs of
          compute capacity and 16 GB of memory. For more information,
          see the Glue pricing page.

          When the "WorkerType" field is set to a value other than
          "Standard", the "MaxCapacity" field is set automatically and
          becomes read-only.

        * **WorkerType** *(string) --*

          The type of predefined worker that is allocated when this
          task runs. Accepts a value of Standard, G.1X, or G.2X.

          * For the "Standard" worker type, each worker provides 4
            vCPUs, 16 GB of memory, and a 50 GB disk, with 2 executors
            per worker.

          * For the "G.1X" worker type, each worker provides 4 vCPUs,
            16 GB of memory, and a 64 GB disk, with 1 executor per
            worker.

          * For the "G.2X" worker type, each worker provides 8 vCPUs,
            32 GB of memory, and a 128 GB disk, with 1 executor per
            worker.

        * **NumberOfWorkers** *(integer) --*

          The number of workers of a defined "WorkerType" that are
          allocated when this task runs.

        * **Timeout** *(integer) --*

          The timeout for a task run for this transform in minutes.
          This is the maximum time that a task run for this transform
          can consume resources before it is terminated and enters
          "TIMEOUT" status. The default is 2,880 minutes (48 hours).

        * **MaxRetries** *(integer) --*

          The maximum number of times to retry a task for this
          transform after a task run fails.

        * **TransformEncryption** *(dict) --*

          The encryption-at-rest settings of the transform that apply
          to accessing user data. Machine learning transforms can
          access user data encrypted in Amazon S3 using KMS.

          * **MlUserDataEncryption** *(dict) --*

            An "MLUserDataEncryption" object containing the encryption
            mode and customer-provided KMS key ID.

            * **MlUserDataEncryptionMode** *(string) --*

              The encryption mode applied to user data. Valid values
              are:

              * DISABLED: encryption is disabled

              * SSEKMS: use of server-side encryption with Key
                Management Service (SSE-KMS) for user data stored in
                Amazon S3.

            * **KmsKeyId** *(string) --*

              The ID for the customer-provided KMS key.

          * **TaskRunSecurityConfigurationName** *(string) --*

            The name of the security configuration.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"
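
   The confusion matrix counts described above can be turned into
   precision, recall, and F1 values. The following is a minimal
   sketch, not part of the Glue API: the helper name
   "confusion_matrix_metrics" is hypothetical, and it assumes you pass
   it the "ConfusionMatrix" dict taken from a response. The values it
   computes may differ from the reported "F1", which the service
   describes as a maximum.

```python
def confusion_matrix_metrics(cm):
    """Derive precision, recall, and F1 from a ConfusionMatrix dict.

    `cm` is assumed to carry the count keys documented above. A metric
    whose denominator is zero is returned as None.
    """
    tp = cm.get("NumTruePositives", 0)
    fp = cm.get("NumFalsePositives", 0)
    fn = cm.get("NumFalseNegatives", 0)

    # Precision: share of predicted matches that are real matches.
    precision = tp / (tp + fp) if (tp + fp) else None
    # Recall: share of real matches that the transform found.
    recall = tp / (tp + fn) if (tp + fn) else None

    if precision is None or recall is None or (precision + recall) == 0:
        f1 = None
    else:
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * precision * recall / (precision + recall)

    return {"precision": precision, "recall": recall, "f1": f1}
```

   "NumTrueNegatives" is not needed for these three metrics, so the
   helper ignores it.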


delete_schema
*************

Glue.Client.delete_schema(**kwargs)

   Deletes the entire schema set, including all of its versions. The
   call is asynchronous; to get the status of the delete operation,
   call the "GetSchema" API afterward. Deleting a schema deactivates
   all online operations for the schema, such as the
   "GetSchemaByDefinition" and "RegisterSchemaVersion" APIs.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_schema(
          SchemaId={
              'SchemaArn': 'string',
              'SchemaName': 'string',
              'RegistryName': 'string'
          }
      )

   Parameters:
      **SchemaId** (*dict*) --

      **[REQUIRED]**

      This is a wrapper structure that may contain the schema name and
      Amazon Resource Name (ARN).

      * **SchemaArn** *(string) --*

        The Amazon Resource Name (ARN) of the schema. One of
        "SchemaArn" or "SchemaName" has to be provided.

      * **SchemaName** *(string) --*

        The name of the schema. One of "SchemaArn" or "SchemaName" has
        to be provided.

      * **RegistryName** *(string) --*

        The name of the schema registry that contains the schema.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'SchemaArn': 'string',
             'SchemaName': 'string',
             'Status': 'AVAILABLE'|'PENDING'|'DELETING'
         }

      **Response Structure**

      * *(dict) --*

        * **SchemaArn** *(string) --*

          The Amazon Resource Name (ARN) of the schema being deleted.

        * **SchemaName** *(string) --*

          The name of the schema being deleted.

        * **Status** *(string) --*

          The status of the schema.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.ConcurrentModificationException"
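
   Because the deletion is asynchronous, a caller may want to poll
   until the schema is gone. The following is a minimal sketch under
   stated assumptions: the helper names "build_schema_id" and
   "delete_schema_and_wait" are hypothetical, "client" is a Glue
   client such as the one returned by "boto3.client('glue')", and the
   sketch assumes that "get_schema" raises "EntityNotFoundException"
   once the deletion has finished.

```python
import time


def build_schema_id(schema_arn=None, schema_name=None, registry_name=None):
    """Build the SchemaId wrapper; SchemaArn or SchemaName is required."""
    if schema_arn:
        # Illustrative choice: the ARN alone identifies the schema.
        return {"SchemaArn": schema_arn}
    if not schema_name:
        raise ValueError("Provide schema_arn or schema_name")
    schema_id = {"SchemaName": schema_name}
    if registry_name:
        schema_id["RegistryName"] = registry_name
    return schema_id


def delete_schema_and_wait(client, schema_id, poll_seconds=2.0, max_polls=30):
    """Request deletion, then poll get_schema until the schema is gone.

    Returns True once get_schema raises EntityNotFoundException
    (assumed to signal that deletion finished), or False if the schema
    is still visible after max_polls attempts.
    """
    client.delete_schema(SchemaId=schema_id)
    for _ in range(max_polls):
        try:
            client.get_schema(SchemaId=schema_id)
        except client.exceptions.EntityNotFoundException:
            return True
        time.sleep(poll_seconds)
    return False
```

   The polling interval and attempt limit are arbitrary defaults;
   adjust them to the size of the schema set being deleted.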


get_data_quality_result
***********************

Glue.Client.get_data_quality_result(**kwargs)

   Retrieves the result of a data quality rule evaluation.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_data_quality_result(
          ResultId='string'
      )

   Parameters:
      **ResultId** (*string*) --

      **[REQUIRED]**

      A unique result ID for the data quality result.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'ResultId': 'string',
             'ProfileId': 'string',
             'Score': 123.0,
             'DataSource': {
                 'GlueTable': {
                     'DatabaseName': 'string',
                     'TableName': 'string',
                     'CatalogId': 'string',
                     'ConnectionName': 'string',
                     'AdditionalOptions': {
                         'string': 'string'
                     }
                 }
             },
             'RulesetName': 'string',
             'EvaluationContext': 'string',
             'StartedOn': datetime(2015, 1, 1),
             'CompletedOn': datetime(2015, 1, 1),
             'JobName': 'string',
             'JobRunId': 'string',
             'RulesetEvaluationRunId': 'string',
             'RuleResults': [
                 {
                     'Name': 'string',
                     'Description': 'string',
                     'EvaluationMessage': 'string',
                     'Result': 'PASS'|'FAIL'|'ERROR',
                     'EvaluatedMetrics': {
                         'string': 123.0
                     },
                     'EvaluatedRule': 'string',
                     'RuleMetrics': {
                         'string': 123.0
                     }
                 },
             ],
             'AnalyzerResults': [
                 {
                     'Name': 'string',
                     'Description': 'string',
                     'EvaluationMessage': 'string',
                     'EvaluatedMetrics': {
                         'string': 123.0
                     }
                 },
             ],
             'Observations': [
                 {
                     'Description': 'string',
                     'MetricBasedObservation': {
                         'MetricName': 'string',
                         'StatisticId': 'string',
                         'MetricValues': {
                             'ActualValue': 123.0,
                             'ExpectedValue': 123.0,
                             'LowerLimit': 123.0,
                             'UpperLimit': 123.0
                         },
                         'NewRules': [
                             'string',
                         ]
                     }
                 },
             ],
             'AggregatedMetrics': {
                 'TotalRowsProcessed': 123.0,
                 'TotalRowsPassed': 123.0,
                 'TotalRowsFailed': 123.0,
                 'TotalRulesProcessed': 123.0,
                 'TotalRulesPassed': 123.0,
                 'TotalRulesFailed': 123.0
             }
         }

      **Response Structure**

      * *(dict) --*

        The response for the data quality result.

        * **ResultId** *(string) --*

          A unique result ID for the data quality result.

        * **ProfileId** *(string) --*

          The Profile ID for the data quality result.

        * **Score** *(float) --*

          An aggregate data quality score. Represents the ratio of
          rules that passed to the total number of rules.

        * **DataSource** *(dict) --*

          The table associated with the data quality result, if any.

          * **GlueTable** *(dict) --*

            A Glue table.

            * **DatabaseName** *(string) --*

              A database name in the Glue Data Catalog.

            * **TableName** *(string) --*

              A table name in the Glue Data Catalog.

            * **CatalogId** *(string) --*

              A unique identifier for the Glue Data Catalog.

            * **ConnectionName** *(string) --*

              The name of the connection to the Glue Data Catalog.

            * **AdditionalOptions** *(dict) --*

              Additional options for the table. Currently there are
              two keys supported:

              * "pushDownPredicate": to filter on partitions without
                having to list and read all the files in your dataset.

              * "catalogPartitionPredicate": to use server-side
                partition pruning using partition indexes in the Glue
                Data Catalog.

              * *(string) --*

                * *(string) --*

        * **RulesetName** *(string) --*

          The name of the ruleset associated with the data quality
          result.

        * **EvaluationContext** *(string) --*

          In a Glue Studio job, each node on the canvas is typically
          assigned a name, including data quality nodes. When a job
          has multiple data quality nodes, the "EvaluationContext"
          distinguishes between them.

        * **StartedOn** *(datetime) --*

          The date and time when the run for this data quality result
          started.

        * **CompletedOn** *(datetime) --*

          The date and time when the run for this data quality result
          was completed.

        * **JobName** *(string) --*

          The job name associated with the data quality result, if
          any.

        * **JobRunId** *(string) --*

          The job run ID associated with the data quality result, if
          any.

        * **RulesetEvaluationRunId** *(string) --*

          The unique run ID associated with the ruleset evaluation.

        * **RuleResults** *(list) --*

          A list of "DataQualityRuleResult" objects representing the
          results for each rule.

          * *(dict) --*

            Describes the result of the evaluation of a data quality
            rule.

            * **Name** *(string) --*

              The name of the data quality rule.

            * **Description** *(string) --*

              A description of the data quality rule.

            * **EvaluationMessage** *(string) --*

              An evaluation message.

            * **Result** *(string) --*

              The status of the rule: pass, fail, or error.

            * **EvaluatedMetrics** *(dict) --*

              A map of metrics associated with the evaluation of the
              rule.

              * *(string) --*

                * *(float) --*

            * **EvaluatedRule** *(string) --*

              The evaluated rule.

            * **RuleMetrics** *(dict) --*

              A map containing metrics associated with the evaluation
              of the rule based on row-level results.

              * *(string) --*

                * *(float) --*

        * **AnalyzerResults** *(list) --*

          A list of "DataQualityAnalyzerResult" objects representing
          the results for each analyzer.

          * *(dict) --*

            Describes the result of the evaluation of a data quality
            analyzer.

            * **Name** *(string) --*

              The name of the data quality analyzer.

            * **Description** *(string) --*

              A description of the data quality analyzer.

            * **EvaluationMessage** *(string) --*

              An evaluation message.

            * **EvaluatedMetrics** *(dict) --*

              A map of metrics associated with the evaluation of the
              analyzer.

              * *(string) --*

                * *(float) --*

        * **Observations** *(list) --*

          A list of "DataQualityObservation" objects representing the
          observations generated after evaluating the rules and
          analyzers.

          * *(dict) --*

            Describes the observation generated after evaluating the
            rules and analyzers.

            * **Description** *(string) --*

              A description of the data quality observation.

            * **MetricBasedObservation** *(dict) --*

              An object of type "MetricBasedObservation" representing
              the observation that is based on evaluated data quality
              metrics.

              * **MetricName** *(string) --*

                The name of the data quality metric used for
                generating the observation.

              * **StatisticId** *(string) --*

                The Statistic ID.

              * **MetricValues** *(dict) --*

                An object of type "DataQualityMetricValues"
                representing the analysis of the data quality metric
                value.

                * **ActualValue** *(float) --*

                  The actual value of the data quality metric.

                * **ExpectedValue** *(float) --*

                  The expected value of the data quality metric
                  according to the analysis of historical data.

                * **LowerLimit** *(float) --*

                  The lower limit of the data quality metric value
                  according to the analysis of historical data.

                * **UpperLimit** *(float) --*

                  The upper limit of the data quality metric value
                  according to the analysis of historical data.

              * **NewRules** *(list) --*

                A list of new data quality rules generated as part of
                the observation based on the data quality metric
                value.

                * *(string) --*

        * **AggregatedMetrics** *(dict) --*

          A summary of "DataQualityAggregatedMetrics" objects showing
          the total counts of processed rows and rules, including
          their pass/fail statistics based on row-level results.

          * **TotalRowsProcessed** *(float) --*

            The total number of rows that were processed during the
            data quality evaluation.

          * **TotalRowsPassed** *(float) --*

            The total number of rows that passed all applicable data
            quality rules.

          * **TotalRowsFailed** *(float) --*

            The total number of rows that failed one or more data
            quality rules.

          * **TotalRulesProcessed** *(float) --*

            The total number of data quality rules that were
            evaluated.

          * **TotalRulesPassed** *(float) --*

            The total number of data quality rules that passed their
            evaluation criteria.

          * **TotalRulesFailed** *(float) --*

            The total number of data quality rules that failed their
            evaluation criteria.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.EntityNotFoundException"
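
   A response with the structure above can be reduced to the score
   and the rules that did not pass. The following is a minimal
   sketch; the helper name "summarize_data_quality_result" is
   hypothetical, and it only reads keys documented in the response
   structure.

```python
def summarize_data_quality_result(response):
    """Collect the score and the non-passing rules from a response dict.

    `response` is assumed to be the dict returned by
    client.get_data_quality_result(ResultId=...).
    """
    rule_results = response.get("RuleResults", [])
    # Keep every rule whose Result is FAIL or ERROR.
    not_passed = [
        {
            "Name": rule.get("Name"),
            "Result": rule.get("Result"),
            "Message": rule.get("EvaluationMessage"),
        }
        for rule in rule_results
        if rule.get("Result") != "PASS"
    ]
    return {
        "ResultId": response.get("ResultId"),
        "Score": response.get("Score"),
        "TotalRules": len(rule_results),
        "NotPassed": not_passed,
    }
```

   Using ".get" throughout keeps the helper tolerant of responses in
   which optional keys such as "RuleResults" are absent.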


describe_connection_type
************************

Glue.Client.describe_connection_type(**kwargs)

   The "DescribeConnectionType" API provides full details of the
   supported options for a given connection type in Glue.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.describe_connection_type(
          ConnectionType='string'
      )

   Parameters:
      **ConnectionType** (*string*) --

      **[REQUIRED]**

      The name of the connection type to be described.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'ConnectionType': 'string',
             'Description': 'string',
             'Capabilities': {
                 'SupportedAuthenticationTypes': [
                     'BASIC'|'OAUTH2'|'CUSTOM'|'IAM',
                 ],
                 'SupportedDataOperations': [
                     'READ'|'WRITE',
                 ],
                 'SupportedComputeEnvironments': [
                     'SPARK'|'ATHENA'|'PYTHON',
                 ]
             },
             'ConnectionProperties': {
                 'string': {
                     'Name': 'string',
                     'Description': 'string',
                     'Required': True|False,
                     'DefaultValue': 'string',
                     'PropertyTypes': [
                         'USER_INPUT'|'SECRET'|'READ_ONLY'|'UNUSED'|'SECRET_OR_USER_INPUT',
                     ],
                     'AllowedValues': [
                         {
                             'Description': 'string',
                             'Value': 'string'
                         },
                     ],
                     'DataOperationScopes': [
                         'READ'|'WRITE',
                     ]
                 }
             },
             'ConnectionOptions': {
                 'string': {
                     'Name': 'string',
                     'Description': 'string',
                     'Required': True|False,
                     'DefaultValue': 'string',
                     'PropertyTypes': [
                         'USER_INPUT'|'SECRET'|'READ_ONLY'|'UNUSED'|'SECRET_OR_USER_INPUT',
                     ],
                     'AllowedValues': [
                         {
                             'Description': 'string',
                             'Value': 'string'
                         },
                     ],
                     'DataOperationScopes': [
                         'READ'|'WRITE',
                     ]
                 }
             },
             'AuthenticationConfiguration': {
                 'AuthenticationType': {
                     'Name': 'string',
                     'Description': 'string',
                     'Required': True|False,
                     'DefaultValue': 'string',
                     'PropertyTypes': [
                         'USER_INPUT'|'SECRET'|'READ_ONLY'|'UNUSED'|'SECRET_OR_USER_INPUT',
                     ],
                     'AllowedValues': [
                         {
                             'Description': 'string',
                             'Value': 'string'
                         },
                     ],
                     'DataOperationScopes': [
                         'READ'|'WRITE',
                     ]
                 },
                 'SecretArn': {
                     'Name': 'string',
                     'Description': 'string',
                     'Required': True|False,
                     'DefaultValue': 'string',
                     'PropertyTypes': [
                         'USER_INPUT'|'SECRET'|'READ_ONLY'|'UNUSED'|'SECRET_OR_USER_INPUT',
                     ],
                     'AllowedValues': [
                         {
                             'Description': 'string',
                             'Value': 'string'
                         },
                     ],
                     'DataOperationScopes': [
                         'READ'|'WRITE',
                     ]
                 },
                 'OAuth2Properties': {
                     'string': {
                         'Name': 'string',
                         'Description': 'string',
                         'Required': True|False,
                         'DefaultValue': 'string',
                         'PropertyTypes': [
                             'USER_INPUT'|'SECRET'|'READ_ONLY'|'UNUSED'|'SECRET_OR_USER_INPUT',
                         ],
                         'AllowedValues': [
                             {
                                 'Description': 'string',
                                 'Value': 'string'
                             },
                         ],
                         'DataOperationScopes': [
                             'READ'|'WRITE',
                         ]
                     }
                 },
                 'BasicAuthenticationProperties': {
                     'string': {
                         'Name': 'string',
                         'Description': 'string',
                         'Required': True|False,
                         'DefaultValue': 'string',
                         'PropertyTypes': [
                             'USER_INPUT'|'SECRET'|'READ_ONLY'|'UNUSED'|'SECRET_OR_USER_INPUT',
                         ],
                         'AllowedValues': [
                             {
                                 'Description': 'string',
                                 'Value': 'string'
                             },
                         ],
                         'DataOperationScopes': [
                             'READ'|'WRITE',
                         ]
                     }
                 },
                 'CustomAuthenticationProperties': {
                     'string': {
                         'Name': 'string',
                         'Description': 'string',
                         'Required': True|False,
                         'DefaultValue': 'string',
                         'PropertyTypes': [
                             'USER_INPUT'|'SECRET'|'READ_ONLY'|'UNUSED'|'SECRET_OR_USER_INPUT',
                         ],
                         'AllowedValues': [
                             {
                                 'Description': 'string',
                                 'Value': 'string'
                             },
                         ],
                         'DataOperationScopes': [
                             'READ'|'WRITE',
                         ]
                     }
                 }
             },
             'ComputeEnvironmentConfigurations': {
                 'string': {
                     'Name': 'string',
                     'Description': 'string',
                     'ComputeEnvironment': 'SPARK'|'ATHENA'|'PYTHON',
                     'SupportedAuthenticationTypes': [
                         'BASIC'|'OAUTH2'|'CUSTOM'|'IAM',
                     ],
                     'ConnectionOptions': {
                         'string': {
                             'Name': 'string',
                             'Description': 'string',
                             'Required': True|False,
                             'DefaultValue': 'string',
                             'PropertyTypes': [
                                 'USER_INPUT'|'SECRET'|'READ_ONLY'|'UNUSED'|'SECRET_OR_USER_INPUT',
                             ],
                             'AllowedValues': [
                                 {
                                     'Description': 'string',
                                     'Value': 'string'
                                 },
                             ],
                             'DataOperationScopes': [
                                 'READ'|'WRITE',
                             ]
                         }
                     },
                     'ConnectionPropertyNameOverrides': {
                         'string': 'string'
                     },
                     'ConnectionOptionNameOverrides': {
                         'string': 'string'
                     },
                     'ConnectionPropertiesRequiredOverrides': [
                         'string',
                     ],
                     'PhysicalConnectionPropertiesRequired': True|False
                 }
             },
             'PhysicalConnectionRequirements': {
                 'string': {
                     'Name': 'string',
                     'Description': 'string',
                     'Required': True|False,
                     'DefaultValue': 'string',
                     'PropertyTypes': [
                         'USER_INPUT'|'SECRET'|'READ_ONLY'|'UNUSED'|'SECRET_OR_USER_INPUT',
                     ],
                     'AllowedValues': [
                         {
                             'Description': 'string',
                             'Value': 'string'
                         },
                     ],
                     'DataOperationScopes': [
                         'READ'|'WRITE',
                     ]
                 }
             },
             'AthenaConnectionProperties': {
                 'string': {
                     'Name': 'string',
                     'Description': 'string',
                     'Required': True|False,
                     'DefaultValue': 'string',
                     'PropertyTypes': [
                         'USER_INPUT'|'SECRET'|'READ_ONLY'|'UNUSED'|'SECRET_OR_USER_INPUT',
                     ],
                     'AllowedValues': [
                         {
                             'Description': 'string',
                             'Value': 'string'
                         },
                     ],
                     'DataOperationScopes': [
                         'READ'|'WRITE',
                     ]
                 }
             },
             'PythonConnectionProperties': {
                 'string': {
                     'Name': 'string',
                     'Description': 'string',
                     'Required': True|False,
                     'DefaultValue': 'string',
                     'PropertyTypes': [
                         'USER_INPUT'|'SECRET'|'READ_ONLY'|'UNUSED'|'SECRET_OR_USER_INPUT',
                     ],
                     'AllowedValues': [
                         {
                             'Description': 'string',
                             'Value': 'string'
                         },
                     ],
                     'DataOperationScopes': [
                         'READ'|'WRITE',
                     ]
                 }
             },
             'SparkConnectionProperties': {
                 'string': {
                     'Name': 'string',
                     'Description': 'string',
                     'Required': True|False,
                     'DefaultValue': 'string',
                     'PropertyTypes': [
                         'USER_INPUT'|'SECRET'|'READ_ONLY'|'UNUSED'|'SECRET_OR_USER_INPUT',
                     ],
                     'AllowedValues': [
                         {
                             'Description': 'string',
                             'Value': 'string'
                         },
                     ],
                     'DataOperationScopes': [
                         'READ'|'WRITE',
                     ]
                 }
             }
         }
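
      The syntax above nests the same property shape under several
      keys. As a minimal sketch of reading it, the hypothetical
      helper below lists the entries of "ConnectionProperties" that
      are marked as required. It assumes "response" is the dict
      returned by "describe_connection_type" and uses only the keys
      shown above.

```python
def required_connection_properties(response):
    """Return the sorted names of ConnectionProperties marked Required."""
    names = []
    for key, prop in response.get("ConnectionProperties", {}).items():
        if prop.get("Required"):
            # Fall back to the map key if the property has no Name.
            names.append(prop.get("Name", key))
    return sorted(names)
```

      The same loop could be pointed at "ConnectionOptions" or one of
      the other property maps, since they share the same entry shape.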

      **Response Structure**

      * *(dict) --*

        * **ConnectionType** *(string) --*

          The name of the connection type.

        * **Description** *(string) --*

          A description of the connection type.

        * **Capabilities** *(dict) --*

          The supported authentication types, data interface types
          (compute environments), and data operations of the
          connector.

          * **SupportedAuthenticationTypes** *(list) --*

            A list of supported authentication types.

            * *(string) --*

          * **SupportedDataOperations** *(list) --*

            A list of supported data operations.

            * *(string) --*

          * **SupportedComputeEnvironments** *(list) --*

            A list of supported compute environments.

            * *(string) --*

        * **ConnectionProperties** *(dict) --*

          Connection properties which are common across compute
          environments.

          * *(string) --*

            * *(dict) --*

              An object that defines a connection type for a compute
              environment.

              * **Name** *(string) --*

                The name of the property.

              * **Description** *(string) --*

                A description of the property.

              * **Required** *(boolean) --*

                Indicates whether the property is required.

              * **DefaultValue** *(string) --*

                The default value for the property.

              * **PropertyTypes** *(list) --*

                Describes the type of property.

                * *(string) --*

              * **AllowedValues** *(list) --*

                A list of "AllowedValue" objects representing the
                values allowed for the property.

                * *(dict) --*

                  An object representing a value allowed for a
                  property.

                  * **Description** *(string) --*

                    A description of the allowed value.

                  * **Value** *(string) --*

                    The value allowed for the property.

              * **DataOperationScopes** *(list) --*

                Indicates which data operations are applicable to the
                property.

                * *(string) --*

        * **ConnectionOptions** *(dict) --*

          Returns properties that can be set when creating a
          connection in the "ConnectionInput.ConnectionProperties".
          "ConnectionOptions" defines parameters that can be set in a
          Spark ETL script in the connection options map passed to a
          dataframe.

          * *(string) --*

            * *(dict) --*

              An object that defines a connection type for a compute
              environment.

              * **Name** *(string) --*

                The name of the property.

              * **Description** *(string) --*

                A description of the property.

              * **Required** *(boolean) --*

                Indicates whether the property is required.

              * **DefaultValue** *(string) --*

                The default value for the property.

              * **PropertyTypes** *(list) --*

                Describes the type of property.

                * *(string) --*

              * **AllowedValues** *(list) --*

                A list of "AllowedValue" objects representing the
                values allowed for the property.

                * *(dict) --*

                  An object representing a value allowed for a
                  property.

                  * **Description** *(string) --*

                    A description of the allowed value.

                  * **Value** *(string) --*

                    The value allowed for the property.

              * **DataOperationScopes** *(list) --*

                Indicates which data operations are applicable to the
                property.

                * *(string) --*

        * **AuthenticationConfiguration** *(dict) --*

          The type of authentication used for the connection.

          * **AuthenticationType** *(dict) --*

            The type of authentication for a connection.

            * **Name** *(string) --*

              The name of the property.

            * **Description** *(string) --*

              A description of the property.

            * **Required** *(boolean) --*

              Indicates whether the property is required.

            * **DefaultValue** *(string) --*

              The default value for the property.

            * **PropertyTypes** *(list) --*

              Describes the type of property.

              * *(string) --*

            * **AllowedValues** *(list) --*

              A list of "AllowedValue" objects representing the values
              allowed for the property.

              * *(dict) --*

                An object representing a value allowed for a property.

                * **Description** *(string) --*

                  A description of the allowed value.

                * **Value** *(string) --*

                  The value allowed for the property.

            * **DataOperationScopes** *(list) --*

              Indicates which data operations are applicable to the
              property.

              * *(string) --*

          * **SecretArn** *(dict) --*

            The Amazon Resource Name (ARN) of the Secrets Manager secret.

            * **Name** *(string) --*

              The name of the property.

            * **Description** *(string) --*

              A description of the property.

            * **Required** *(boolean) --*

              Indicates whether the property is required.

            * **DefaultValue** *(string) --*

              The default value for the property.

            * **PropertyTypes** *(list) --*

              Describes the type of property.

              * *(string) --*

            * **AllowedValues** *(list) --*

              A list of "AllowedValue" objects representing the values
              allowed for the property.

              * *(dict) --*

                An object representing a value allowed for a property.

                * **Description** *(string) --*

                  A description of the allowed value.

                * **Value** *(string) --*

                  The value allowed for the property.

            * **DataOperationScopes** *(list) --*

              Indicates which data operations are applicable to the
              property.

              * *(string) --*

          * **OAuth2Properties** *(dict) --*

            A map of key-value pairs for the OAuth2 properties. Each
            value is a "Property" object.

            * *(string) --*

              * *(dict) --*

                An object that defines a connection type for a compute
                environment.

                * **Name** *(string) --*

                  The name of the property.

                * **Description** *(string) --*

                  A description of the property.

                * **Required** *(boolean) --*

                  Indicates whether the property is required.

                * **DefaultValue** *(string) --*

                  The default value for the property.

                * **PropertyTypes** *(list) --*

                  Describes the type of property.

                  * *(string) --*

                * **AllowedValues** *(list) --*

                  A list of "AllowedValue" objects representing the
                  values allowed for the property.

                  * *(dict) --*

                    An object representing a value allowed for a
                    property.

                    * **Description** *(string) --*

                      A description of the allowed value.

                    * **Value** *(string) --*

                      The value allowed for the property.

                * **DataOperationScopes** *(list) --*

                  Indicates which data operations are applicable to
                  the property.

                  * *(string) --*

          * **BasicAuthenticationProperties** *(dict) --*

            A map of key-value pairs for the basic authentication
            properties. Each value is a "Property" object.

            * *(string) --*

              * *(dict) --*

                An object that defines a connection type for a compute
                environment.

                * **Name** *(string) --*

                  The name of the property.

                * **Description** *(string) --*

                  A description of the property.

                * **Required** *(boolean) --*

                  Indicates whether the property is required.

                * **DefaultValue** *(string) --*

                  The default value for the property.

                * **PropertyTypes** *(list) --*

                  Describes the type of property.

                  * *(string) --*

                * **AllowedValues** *(list) --*

                  A list of "AllowedValue" objects representing the
                  values allowed for the property.

                  * *(dict) --*

                    An object representing a value allowed for a
                    property.

                    * **Description** *(string) --*

                      A description of the allowed value.

                    * **Value** *(string) --*

                      The value allowed for the property.

                * **DataOperationScopes** *(list) --*

                  Indicates which data operations are applicable to
                  the property.

                  * *(string) --*

          * **CustomAuthenticationProperties** *(dict) --*

            A map of key-value pairs for the custom authentication
            properties. Each value is a "Property" object.

            * *(string) --*

              * *(dict) --*

                An object that defines a connection type for a compute
                environment.

                * **Name** *(string) --*

                  The name of the property.

                * **Description** *(string) --*

                  A description of the property.

                * **Required** *(boolean) --*

                  Indicates whether the property is required.

                * **DefaultValue** *(string) --*

                  The default value for the property.

                * **PropertyTypes** *(list) --*

                  Describes the type of property.

                  * *(string) --*

                * **AllowedValues** *(list) --*

                  A list of "AllowedValue" objects representing the
                  values allowed for the property.

                  * *(dict) --*

                    An object representing a value allowed for a
                    property.

                    * **Description** *(string) --*

                      A description of the allowed value.

                    * **Value** *(string) --*

                      The value allowed for the property.

                * **DataOperationScopes** *(list) --*

                  Indicates which data operations are applicable to
                  the property.

                  * *(string) --*

        * **ComputeEnvironmentConfigurations** *(dict) --*

          The compute environments that are supported by the
          connection.

          * *(string) --*

            * *(dict) --*

              An object containing configuration for a compute
              environment (such as Spark, Python or Athena) returned
              by the "DescribeConnectionType" API.

              * **Name** *(string) --*

                A name for the compute environment configuration.

              * **Description** *(string) --*

                A description of the compute environment.

              * **ComputeEnvironment** *(string) --*

                The type of compute environment.

              * **SupportedAuthenticationTypes** *(list) --*

                The supported authentication types for the compute
                environment.

                * *(string) --*

              * **ConnectionOptions** *(dict) --*

                The parameters used as connection options for the
                compute environment.

                * *(string) --*

                  * *(dict) --*

                    An object that defines a connection type for a
                    compute environment.

                    * **Name** *(string) --*

                      The name of the property.

                    * **Description** *(string) --*

                      A description of the property.

                    * **Required** *(boolean) --*

                      Indicates whether the property is required.

                    * **DefaultValue** *(string) --*

                      The default value for the property.

                    * **PropertyTypes** *(list) --*

                      Describes the type of property.

                      * *(string) --*

                    * **AllowedValues** *(list) --*

                      A list of "AllowedValue" objects representing
                      the values allowed for the property.

                      * *(dict) --*

                        An object representing a value allowed for a
                        property.

                        * **Description** *(string) --*

                          A description of the allowed value.

                        * **Value** *(string) --*

                          The value allowed for the property.

                    * **DataOperationScopes** *(list) --*

                      Indicates which data operations are applicable
                      to the property.

                      * *(string) --*

              * **ConnectionPropertyNameOverrides** *(dict) --*

                The connection property name overrides for the compute
                environment.

                * *(string) --*

                  * *(string) --*

              * **ConnectionOptionNameOverrides** *(dict) --*

                The connection option name overrides for the compute
                environment.

                * *(string) --*

                  * *(string) --*

              * **ConnectionPropertiesRequiredOverrides** *(list) --*

                The connection properties that are required as
                overrides for the compute environment.

                * *(string) --*

              * **PhysicalConnectionPropertiesRequired** *(boolean)
                --*

                Indicates whether "PhysicalConnectionProperties" are
                required for the compute environment.

        * **PhysicalConnectionRequirements** *(dict) --*

          Physical requirements for a connection, such as VPC, Subnet
          and Security Group specifications.

          * *(string) --*

            * *(dict) --*

              An object that defines a connection type for a compute
              environment.

              * **Name** *(string) --*

                The name of the property.

              * **Description** *(string) --*

                A description of the property.

              * **Required** *(boolean) --*

                Indicates whether the property is required.

              * **DefaultValue** *(string) --*

                The default value for the property.

              * **PropertyTypes** *(list) --*

                Describes the type of property.

                * *(string) --*

              * **AllowedValues** *(list) --*

                A list of "AllowedValue" objects representing the
                values allowed for the property.

                * *(dict) --*

                  An object representing a value allowed for a
                  property.

                  * **Description** *(string) --*

                    A description of the allowed value.

                  * **Value** *(string) --*

                    The value allowed for the property.

              * **DataOperationScopes** *(list) --*

                Indicates which data operations are applicable to the
                property.

                * *(string) --*

        * **AthenaConnectionProperties** *(dict) --*

          Connection properties specific to the Athena compute
          environment.

          * *(string) --*

            * *(dict) --*

              An object that defines a connection type for a compute
              environment.

              * **Name** *(string) --*

                The name of the property.

              * **Description** *(string) --*

                A description of the property.

              * **Required** *(boolean) --*

                Indicates whether the property is required.

              * **DefaultValue** *(string) --*

                The default value for the property.

              * **PropertyTypes** *(list) --*

                Describes the type of property.

                * *(string) --*

              * **AllowedValues** *(list) --*

                A list of "AllowedValue" objects representing the
                values allowed for the property.

                * *(dict) --*

                  An object representing a value allowed for a
                  property.

                  * **Description** *(string) --*

                    A description of the allowed value.

                  * **Value** *(string) --*

                    The value allowed for the property.

              * **DataOperationScopes** *(list) --*

                Indicates which data operations are applicable to the
                property.

                * *(string) --*

        * **PythonConnectionProperties** *(dict) --*

          Connection properties specific to the Python compute
          environment.

          * *(string) --*

            * *(dict) --*

              An object that defines a connection type for a compute
              environment.

              * **Name** *(string) --*

                The name of the property.

              * **Description** *(string) --*

                A description of the property.

              * **Required** *(boolean) --*

                Indicates whether the property is required.

              * **DefaultValue** *(string) --*

                The default value for the property.

              * **PropertyTypes** *(list) --*

                Describes the type of property.

                * *(string) --*

              * **AllowedValues** *(list) --*

                A list of "AllowedValue" objects representing the
                values allowed for the property.

                * *(dict) --*

                  An object representing a value allowed for a
                  property.

                  * **Description** *(string) --*

                    A description of the allowed value.

                  * **Value** *(string) --*

                    The value allowed for the property.

              * **DataOperationScopes** *(list) --*

                Indicates which data operations are applicable to the
                property.

                * *(string) --*

        * **SparkConnectionProperties** *(dict) --*

          Connection properties specific to the Spark compute
          environment.

          * *(string) --*

            * *(dict) --*

              An object that defines a connection type for a compute
              environment.

              * **Name** *(string) --*

                The name of the property.

              * **Description** *(string) --*

                A description of the property.

              * **Required** *(boolean) --*

                Indicates whether the property is required.

              * **DefaultValue** *(string) --*

                The default value for the property.

              * **PropertyTypes** *(list) --*

                Describes the type of property.

                * *(string) --*

              * **AllowedValues** *(list) --*

                A list of "AllowedValue" objects representing the
                values allowed for the property.

                * *(dict) --*

                  An object representing a value allowed for a
                  property.

                  * **Description** *(string) --*

                    A description of the allowed value.

                  * **Value** *(string) --*

                    The value allowed for the property.

              * **DataOperationScopes** *(list) --*

                Indicates which data operations are applicable to the
                property.

                * *(string) --*

   **Exceptions**

   * "Glue.Client.exceptions.ValidationException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.AccessDeniedException"
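
   The property maps in the response structure above ("ConnectionProperties",
   "ConnectionOptions", and so on) all share the same "Property" shape, so
   they can be inspected with small helpers. The following is a minimal
   sketch, not part of the SDK: the helper names and the sample keys
   ("HOST", "PORT", "ROLE") are hypothetical, and it assumes the map key is
   the identifier you want to report.

```python
def required_properties(response, section="ConnectionProperties"):
    """Return the sorted keys of properties flagged Required in one section."""
    properties = response.get(section) or {}
    return sorted(key for key, prop in properties.items() if prop.get("Required"))


def allowed_values(response, key, section="ConnectionProperties"):
    """Return the allowed values for one property, or [] if none are listed."""
    prop = (response.get(section) or {}).get(key, {})
    return [item["Value"] for item in prop.get("AllowedValues", [])]


# Hypothetical response fragment, hand-written for illustration only.
sample = {
    "ConnectionProperties": {
        "HOST": {"Name": "HOST", "Required": True},
        "PORT": {"Name": "PORT", "Required": False, "DefaultValue": "443"},
        "ROLE": {
            "Name": "ROLE",
            "Required": True,
            "AllowedValues": [
                {"Value": "READ", "Description": "Read only"},
                {"Value": "WRITE", "Description": "Read and write"},
            ],
        },
    }
}

print(required_properties(sample))     # ['HOST', 'ROLE']
print(allowed_values(sample, "ROLE"))  # ['READ', 'WRITE']
```

   Passing a different "section" (for example "ConnectionOptions") applies
   the same lookup to any of the other top-level property maps.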


get_data_quality_ruleset
************************

Glue.Client.get_data_quality_ruleset(**kwargs)

   Returns an existing ruleset by identifier or name.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_data_quality_ruleset(
          Name='string'
      )

   Parameters:
      **Name** (*string*) --

      **[REQUIRED]**

      The name of the ruleset.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Name': 'string',
             'Description': 'string',
             'Ruleset': 'string',
             'TargetTable': {
                 'TableName': 'string',
                 'DatabaseName': 'string',
                 'CatalogId': 'string'
             },
             'CreatedOn': datetime(2015, 1, 1),
             'LastModifiedOn': datetime(2015, 1, 1),
             'RecommendationRunId': 'string',
             'DataQualitySecurityConfiguration': 'string'
         }

      **Response Structure**

      * *(dict) --*

        Returns the data quality ruleset response.

        * **Name** *(string) --*

          The name of the ruleset.

        * **Description** *(string) --*

          A description of the ruleset.

        * **Ruleset** *(string) --*

          A Data Quality Definition Language (DQDL) ruleset. For more
          information, see the Glue developer guide.

        * **TargetTable** *(dict) --*

          The name and database name of the target table.

          * **TableName** *(string) --*

            The name of the Glue table.

          * **DatabaseName** *(string) --*

            The name of the database where the Glue table exists.

          * **CatalogId** *(string) --*

            The catalog ID where the Glue table exists.

        * **CreatedOn** *(datetime) --*

          A timestamp. The time and date that this data quality
          ruleset was created.

        * **LastModifiedOn** *(datetime) --*

          A timestamp. The last point in time when this data quality
          ruleset was modified.

        * **RecommendationRunId** *(string) --*

          When a ruleset was created from a recommendation run, this
          run ID is generated to link the two together.

        * **DataQualitySecurityConfiguration** *(string) --*

          The name of the security configuration created with the data
          quality encryption option.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"
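
   A common pattern is to treat a missing ruleset as an ordinary result
   rather than an error. The sketch below assumes a Glue client is passed
   in (for example one created with "boto3.client('glue')"); the helper
   name "get_ruleset_or_none" is hypothetical and not part of the SDK.

```python
def get_ruleset_or_none(client, name):
    """Fetch a ruleset by name; return None if Glue reports it missing."""
    try:
        response = client.get_data_quality_ruleset(Name=name)
    except client.exceptions.EntityNotFoundException:
        # The ruleset does not exist; other errors are left to propagate.
        return None
    # Keep a subset of the documented response fields.
    return {
        "Name": response.get("Name"),
        "Ruleset": response.get("Ruleset"),
        "TargetTable": response.get("TargetTable"),
        "LastModifiedOn": response.get("LastModifiedOn"),
    }
```

   Only "EntityNotFoundException" is caught here, so the other listed
   exceptions still surface to the caller.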


update_schema
*************

Glue.Client.update_schema(**kwargs)

   Updates the description, compatibility setting, or version
   checkpoint for a schema set.

   When updating the compatibility setting, the call does not validate
   compatibility for the entire set of schema versions against the new
   compatibility setting. If a value for "Compatibility" is provided,
   the "VersionNumber" (a checkpoint) is also required. The API
   validates the checkpoint version number for consistency.

   If the value for the "VersionNumber" (checkpoint) is provided,
   "Compatibility" is optional and this can be used to set/reset a
   checkpoint for the schema.

   This update will happen only if the schema is in the AVAILABLE
   state.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.update_schema(
          SchemaId={
              'SchemaArn': 'string',
              'SchemaName': 'string',
              'RegistryName': 'string'
          },
          SchemaVersionNumber={
              'LatestVersion': True|False,
              'VersionNumber': 123
          },
          Compatibility='NONE'|'DISABLED'|'BACKWARD'|'BACKWARD_ALL'|'FORWARD'|'FORWARD_ALL'|'FULL'|'FULL_ALL',
          Description='string'
      )

   Parameters:
      * **SchemaId** (*dict*) --

        **[REQUIRED]**

        This is a wrapper structure to contain schema identity fields.
        The structure contains:

        * SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the
          schema. One of "SchemaArn" or "SchemaName" has to be
          provided.

        * SchemaId$SchemaName: The name of the schema. One of
          "SchemaArn" or "SchemaName" has to be provided.

        * **SchemaArn** *(string) --*

          The Amazon Resource Name (ARN) of the schema. One of
          "SchemaArn" or "SchemaName" has to be provided.

        * **SchemaName** *(string) --*

          The name of the schema. One of "SchemaArn" or "SchemaName"
          has to be provided.

        * **RegistryName** *(string) --*

          The name of the schema registry that contains the schema.

      * **SchemaVersionNumber** (*dict*) --

        Version number required for checkpointing. One of
        "VersionNumber" or "Compatibility" has to be provided.

        * **LatestVersion** *(boolean) --*

          The latest version available for the schema.

        * **VersionNumber** *(integer) --*

          The version number of the schema.

      * **Compatibility** (*string*) -- The new compatibility setting
        for the schema.

      * **Description** (*string*) -- The new description for the
        schema.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'SchemaArn': 'string',
             'SchemaName': 'string',
             'RegistryName': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **SchemaArn** *(string) --*

          The Amazon Resource Name (ARN) of the schema.

        * **SchemaName** *(string) --*

          The name of the schema.

        * **RegistryName** *(string) --*

          The name of the registry that contains the schema.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.ConcurrentModificationException"

   * "Glue.Client.exceptions.InternalServiceException"
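
   The rule that "Compatibility" needs an accompanying checkpoint
   "VersionNumber" can be checked before the request is sent. This is a
   minimal sketch under that reading of the description above; the helper
   "build_update_schema_args" and the names "orders" and "sales" are
   hypothetical.

```python
def build_update_schema_args(schema_name, registry_name, version_number=None,
                             compatibility=None, description=None):
    """Assemble keyword arguments for update_schema, checking the checkpoint rule."""
    if compatibility is not None and version_number is None:
        raise ValueError("Compatibility requires a checkpoint VersionNumber")
    args = {"SchemaId": {"SchemaName": schema_name, "RegistryName": registry_name}}
    if version_number is not None:
        args["SchemaVersionNumber"] = {"VersionNumber": version_number}
    if compatibility is not None:
        args["Compatibility"] = compatibility
    if description is not None:
        args["Description"] = description
    return args


# Hypothetical schema and registry names.
args = build_update_schema_args("orders", "sales",
                                version_number=3, compatibility="BACKWARD")
print(args)
```

   The resulting dictionary could then be passed as
   "client.update_schema(**args)".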


list_column_statistics_task_runs
********************************

Glue.Client.list_column_statistics_task_runs(**kwargs)

   Lists all task runs for a particular account.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.list_column_statistics_task_runs(
          MaxResults=123,
          NextToken='string'
      )

   Parameters:
      * **MaxResults** (*integer*) -- The maximum size of the
        response.

      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation call.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'ColumnStatisticsTaskRunIds': [
                 'string',
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **ColumnStatisticsTaskRunIds** *(list) --*

          A list of column statistics task run IDs.

          * *(string) --*

        * **NextToken** *(string) --*

          A continuation token, if not all task run IDs have yet been
          returned.

   **Exceptions**

   * "Glue.Client.exceptions.OperationTimeoutException"
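
   Because results are returned a page at a time, callers usually loop
   until no "NextToken" comes back. The generator below is a sketch of
   that loop; the name "iter_task_run_ids" and the default page size of
   50 are arbitrary choices for illustration.

```python
def iter_task_run_ids(client, page_size=50):
    """Yield every column statistics task run ID, following NextToken."""
    kwargs = {"MaxResults": page_size}
    while True:
        page = client.list_column_statistics_task_runs(**kwargs)
        yield from page.get("ColumnStatisticsTaskRunIds", [])
        token = page.get("NextToken")
        if not token:
            # No continuation token means this was the last page.
            break
        kwargs["NextToken"] = token
```

   With a real client this might be consumed as
   "for run_id in iter_task_run_ids(boto3.client('glue')): ...".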


list_schemas
************

Glue.Client.list_schemas(**kwargs)

   Returns a list of schemas with minimal details. Schemas in
   "DELETING" status are not included in the results. Empty results
   are returned if there are no schemas available.

   When the "RegistryId" is not provided, all the schemas across
   registries will be part of the API response.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.list_schemas(
          RegistryId={
              'RegistryName': 'string',
              'RegistryArn': 'string'
          },
          MaxResults=123,
          NextToken='string'
      )

   Parameters:
      * **RegistryId** (*dict*) --

        A wrapper structure that may contain the registry name and
        Amazon Resource Name (ARN).

        * **RegistryName** *(string) --*

          Name of the registry. Used only for lookup. One of
          "RegistryArn" or "RegistryName" has to be provided.

        * **RegistryArn** *(string) --*

          The Amazon Resource Name (ARN) of the registry. One of
          "RegistryArn" or "RegistryName" has to be provided.

      * **MaxResults** (*integer*) -- Maximum number of results
        required per page. If the value is not supplied, this
        defaults to 25 per page.

      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation call.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Schemas': [
                 {
                     'RegistryName': 'string',
                     'SchemaName': 'string',
                     'SchemaArn': 'string',
                     'Description': 'string',
                     'SchemaStatus': 'AVAILABLE'|'PENDING'|'DELETING',
                     'CreatedTime': 'string',
                     'UpdatedTime': 'string'
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Schemas** *(list) --*

          An array of "SchemaListItem" objects containing details of
          each schema.

          * *(dict) --*

            An object that contains minimal details for a schema.

            * **RegistryName** *(string) --*

              The name of the registry where the schema resides.

            * **SchemaName** *(string) --*

              The name of the schema.

            * **SchemaArn** *(string) --*

              The Amazon Resource Name (ARN) for the schema.

            * **Description** *(string) --*

              A description for the schema.

            * **SchemaStatus** *(string) --*

              The status of the schema.

            * **CreatedTime** *(string) --*

              The date and time that a schema was created.

            * **UpdatedTime** *(string) --*

              The date and time that a schema was updated.

        * **NextToken** *(string) --*

          A continuation token for paginating the returned list of
          schemas, returned if the current segment of the list is not
          the last.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"
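
   When "RegistryId" is omitted the response spans all registries, so it
   can be useful to group the results by "RegistryName". The sketch below
   does that while following "NextToken"; the helper name
   "schemas_by_registry" is hypothetical.

```python
from collections import defaultdict


def schemas_by_registry(client, registry_name=None):
    """Return {registry name: [schema names]} across all result pages."""
    kwargs = {}
    if registry_name is not None:
        # Restrict the listing to a single registry.
        kwargs["RegistryId"] = {"RegistryName": registry_name}
    grouped = defaultdict(list)
    while True:
        page = client.list_schemas(**kwargs)
        for item in page.get("Schemas", []):
            grouped[item["RegistryName"]].append(item["SchemaName"])
        token = page.get("NextToken")
        if not token:
            return dict(grouped)
        kwargs["NextToken"] = token
```

   An empty dictionary is returned when no schemas are available.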


create_partition
****************

Glue.Client.create_partition(**kwargs)

   Creates a new partition.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_partition(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          PartitionInput={
              'Values': [
                  'string',
              ],
              'LastAccessTime': datetime(2015, 1, 1),
              'StorageDescriptor': {
                  'Columns': [
                      {
                          'Name': 'string',
                          'Type': 'string',
                          'Comment': 'string',
                          'Parameters': {
                              'string': 'string'
                          }
                      },
                  ],
                  'Location': 'string',
                  'AdditionalLocations': [
                      'string',
                  ],
                  'InputFormat': 'string',
                  'OutputFormat': 'string',
                  'Compressed': True|False,
                  'NumberOfBuckets': 123,
                  'SerdeInfo': {
                      'Name': 'string',
                      'SerializationLibrary': 'string',
                      'Parameters': {
                          'string': 'string'
                      }
                  },
                  'BucketColumns': [
                      'string',
                  ],
                  'SortColumns': [
                      {
                          'Column': 'string',
                          'SortOrder': 123
                      },
                  ],
                  'Parameters': {
                      'string': 'string'
                  },
                  'SkewedInfo': {
                      'SkewedColumnNames': [
                          'string',
                      ],
                      'SkewedColumnValues': [
                          'string',
                      ],
                      'SkewedColumnValueLocationMaps': {
                          'string': 'string'
                      }
                  },
                  'StoredAsSubDirectories': True|False,
                  'SchemaReference': {
                      'SchemaId': {
                          'SchemaArn': 'string',
                          'SchemaName': 'string',
                          'RegistryName': 'string'
                      },
                      'SchemaVersionId': 'string',
                      'SchemaVersionNumber': 123
                  }
              },
              'Parameters': {
                  'string': 'string'
              },
              'LastAnalyzedTime': datetime(2015, 1, 1)
          }
      )
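
   The description of "Values" requires the partition values to follow
   the order of the partition keys. One way to make that ordering explicit
   is to derive the list from the key order rather than typing it by hand.
   This is a minimal sketch: the helper names, the keys "year" and
   "month", and the S3 location are hypothetical.

```python
def ordered_partition_values(partition_keys, values_by_key):
    """Return partition values as strings, in partition-key order."""
    missing = [key for key in partition_keys if key not in values_by_key]
    if missing:
        raise KeyError(f"missing partition values for: {missing}")
    return [str(values_by_key[key]) for key in partition_keys]


def build_partition_input(partition_keys, values_by_key, location):
    """Build a small PartitionInput dict with correctly ordered Values."""
    return {
        "Values": ordered_partition_values(partition_keys, values_by_key),
        "StorageDescriptor": {"Location": location},
    }


# Hypothetical key order and location, for illustration only.
keys = ["year", "month"]
partition_input = build_partition_input(
    keys,
    {"month": 7, "year": 2024},
    "s3://example-bucket/sales/year=2024/month=7/",
)
print(partition_input["Values"])  # ['2024', '7']
```

   The resulting dictionary could be passed as the "PartitionInput"
   argument together with "DatabaseName" and "TableName".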

   Parameters:
      * **CatalogId** (*string*) -- The Amazon Web Services account ID
        of the catalog in which the partition is to be created.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the metadata database in which the partition is to
        be created.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the metadata table in which the partition is to be
        created.

      * **PartitionInput** (*dict*) --

        **[REQUIRED]**

        A "PartitionInput" structure defining the partition to be
        created.

        * **Values** *(list) --*

          The values of the partition. Although this parameter is not
          required by the SDK, you must specify this parameter for a
          valid input.

          The values for the keys of the new partition must be passed
          as an array of String objects, in the same order as the
          partition keys appear in the Amazon S3 prefix. Otherwise,
          Glue will add the values to the wrong keys.

          * *(string) --*

        * **LastAccessTime** *(datetime) --*

          The last time at which the partition was accessed.

        * **StorageDescriptor** *(dict) --*

          Provides information about the physical location where the
          partition is stored.

          * **Columns** *(list) --*

            A list of the "Columns" in the table.

            * *(dict) --*

              A column in a "Table".

              * **Name** *(string) --* **[REQUIRED]**

                The name of the "Column".

              * **Type** *(string) --*

                The data type of the "Column".

              * **Comment** *(string) --*

                A free-form text comment.

              * **Parameters** *(dict) --*

                These key-value pairs define properties associated
                with the column.

                * *(string) --*

                  * *(string) --*

          * **Location** *(string) --*

            The physical location of the table. By default, this takes
            the form of the warehouse location, followed by the
            database location in the warehouse, followed by the table
            name.

          * **AdditionalLocations** *(list) --*

            A list of locations that point to the path where a Delta
            table is located.

            * *(string) --*

          * **InputFormat** *(string) --*

            The input format: "SequenceFileInputFormat" (binary), or
            "TextInputFormat", or a custom format.

          * **OutputFormat** *(string) --*

            The output format: "SequenceFileOutputFormat" (binary), or
            "IgnoreKeyTextOutputFormat", or a custom format.

          * **Compressed** *(boolean) --*

            "True" if the data in the table is compressed, or "False"
            if not.

          * **NumberOfBuckets** *(integer) --*

            Must be specified if the table contains any dimension
            columns.

          * **SerdeInfo** *(dict) --*

            The serialization/deserialization (SerDe) information.

            * **Name** *(string) --*

              Name of the SerDe.

            * **SerializationLibrary** *(string) --*

              Usually the class that implements the SerDe. An example
              is
              "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe".

            * **Parameters** *(dict) --*

              These key-value pairs define initialization parameters
              for the SerDe.

              * *(string) --*

                * *(string) --*

          * **BucketColumns** *(list) --*

            A list of reducer grouping columns, clustering columns,
            and bucketing columns in the table.

            * *(string) --*

          * **SortColumns** *(list) --*

            A list specifying the sort order of each bucket in the
            table.

            * *(dict) --*

              Specifies the sort order of a sorted column.

              * **Column** *(string) --* **[REQUIRED]**

                The name of the column.

              * **SortOrder** *(integer) --* **[REQUIRED]**

                Indicates that the column is sorted in ascending order
                ("== 1"), or in descending order ("== 0").

          * **Parameters** *(dict) --*

            The user-supplied properties in key-value form.

            * *(string) --*

              * *(string) --*

          * **SkewedInfo** *(dict) --*

            The information about values that appear frequently in a
            column (skewed values).

            * **SkewedColumnNames** *(list) --*

              A list of names of columns that contain skewed values.

              * *(string) --*

            * **SkewedColumnValues** *(list) --*

              A list of values that appear so frequently as to be
              considered skewed.

              * *(string) --*

            * **SkewedColumnValueLocationMaps** *(dict) --*

              A mapping of skewed values to the columns that contain
              them.

              * *(string) --*

                * *(string) --*

          * **StoredAsSubDirectories** *(boolean) --*

            "True" if the table data is stored in subdirectories, or
            "False" if not.

          * **SchemaReference** *(dict) --*

            An object that references a schema stored in the Glue
            Schema Registry.

            When creating a table, you can pass an empty list of
            columns for the schema, and instead use a schema
            reference.

            * **SchemaId** *(dict) --*

              A structure that contains schema identity fields. Either
              this or the "SchemaVersionId" has to be provided.

              * **SchemaArn** *(string) --*

                The Amazon Resource Name (ARN) of the schema. One of
                "SchemaArn" or "SchemaName" has to be provided.

              * **SchemaName** *(string) --*

                The name of the schema. One of "SchemaArn" or
                "SchemaName" has to be provided.

              * **RegistryName** *(string) --*

                The name of the schema registry that contains the
                schema.

            * **SchemaVersionId** *(string) --*

              The unique ID assigned to a version of the schema.
              Either this or the "SchemaId" has to be provided.

            * **SchemaVersionNumber** *(integer) --*

              The version number of the schema.

        * **Parameters** *(dict) --*

          These key-value pairs define partition parameters.

          * *(string) --*

            * *(string) --*

        * **LastAnalyzedTime** *(datetime) --*

          The last time at which column statistics were computed for
          this partition.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"
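
   The ordering rule for "Values" described above can be sketched as
   follows. This is a minimal illustration, not part of the SDK: the
   helper "build_partition_input", the database, table, and bucket
   names, and the partition keys are all hypothetical, and the sketch
   assumes this section documents "create_partition".

```python
from typing import Dict, List, Sequence


def build_partition_input(
    partition_keys: Sequence[str],
    key_values: Dict[str, str],
    location: str,
) -> dict:
    """Build a PartitionInput dict whose Values follow partition_keys order."""
    missing = [key for key in partition_keys if key not in key_values]
    if missing:
        raise ValueError(f"missing values for partition keys: {missing}")
    # Order the values by the table's partition keys, not by dict order.
    values: List[str] = [str(key_values[key]) for key in partition_keys]
    return {
        "Values": values,
        "StorageDescriptor": {"Location": location},
    }


# Hypothetical table partitioned by year, then month, then day.
partition_input = build_partition_input(
    partition_keys=["year", "month", "day"],
    key_values={"day": "07", "year": "2024", "month": "03"},
    location="s3://example-bucket/events/year=2024/month=03/day=07/",
)

# Keyword arguments that could be passed as client.create_partition(**request_kwargs).
request_kwargs = {
    "DatabaseName": "example_db",
    "TableName": "events",
    "PartitionInput": partition_input,
}
```

   Deriving the order from the table's partition keys, rather than
   from the order in which the caller happened to supply values, is
   one way to avoid values landing on the wrong keys.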


update_table_optimizer
**********************

Glue.Client.update_table_optimizer(**kwargs)

   Updates the configuration for an existing table optimizer.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.update_table_optimizer(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          Type='compaction'|'retention'|'orphan_file_deletion',
          TableOptimizerConfiguration={
              'roleArn': 'string',
              'enabled': True|False,
              'vpcConfiguration': {
                  'glueConnectionName': 'string'
              },
              'compactionConfiguration': {
                  'icebergConfiguration': {
                      'strategy': 'binpack'|'sort'|'z-order',
                      'minInputFiles': 123,
                      'deleteFileThreshold': 123
                  }
              },
              'retentionConfiguration': {
                  'icebergConfiguration': {
                      'snapshotRetentionPeriodInDays': 123,
                      'numberOfSnapshotsToRetain': 123,
                      'cleanExpiredFiles': True|False,
                      'runRateInHours': 123
                  }
              },
              'orphanFileDeletionConfiguration': {
                  'icebergConfiguration': {
                      'orphanFileRetentionPeriodInDays': 123,
                      'location': 'string',
                      'runRateInHours': 123
                  }
              }
          }
      )

   Parameters:
      * **CatalogId** (*string*) --

        **[REQUIRED]**

        The Catalog ID of the table.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the database in the catalog in which the table
        resides.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table.

      * **Type** (*string*) --

        **[REQUIRED]**

        The type of table optimizer.

      * **TableOptimizerConfiguration** (*dict*) --

        **[REQUIRED]**

        A "TableOptimizerConfiguration" object representing the
        configuration of a table optimizer.

        * **roleArn** *(string) --*

          A role passed by the caller which gives the service
          permission to update the resources associated with the
          optimizer on the caller's behalf.

        * **enabled** *(boolean) --*

          Whether table optimization is enabled.

        * **vpcConfiguration** *(dict) --*

          A "TableOptimizerVpcConfiguration" object representing the
          VPC configuration for a table optimizer.

          This configuration is necessary to perform optimization on
          tables that are in a customer VPC.

          Note:

            This is a Tagged Union structure. Only one of the
            following top level keys can be set: "glueConnectionName".

          * **glueConnectionName** *(string) --*

            The name of the Glue connection used for the VPC for the
            table optimizer.

        * **compactionConfiguration** *(dict) --*

          The configuration for a compaction optimizer. This
          configuration defines how data files in your table will be
          compacted to improve query performance and reduce storage
          costs.

          * **icebergConfiguration** *(dict) --*

            The configuration for an Iceberg compaction optimizer.

            * **strategy** *(string) --*

              The strategy to use for compaction. Valid values are:

              * "binpack": Combines small files into larger files,
                typically targeting sizes over 100MB, while applying
                any pending deletes. This is the recommended
                compaction strategy for most use cases.

              * "sort": Organizes data based on specified columns
                which are sorted hierarchically during compaction,
                improving query performance for filtered operations.
                This strategy is recommended when your queries
                frequently filter on specific columns. To use this
                strategy, you must first define a sort order in your
                Iceberg table properties using the "sort_order" table
                property.

              * "z-order": Optimizes data organization by blending
                multiple attributes into a single scalar value that
                can be used for sorting, allowing efficient querying
                across multiple dimensions. This strategy is
                recommended when you need to query data across
                multiple dimensions simultaneously. To use this
                strategy, you must first define a sort order in your
                Iceberg table properties using the "sort_order" table
                property.

              If an input is not provided, the default value 'binpack'
              will be used.

            * **minInputFiles** *(integer) --*

              The minimum number of data files that must be present in
              a partition before compaction will actually compact
              files. This parameter helps control when compaction is
              triggered, preventing unnecessary compaction operations
              on partitions with few files. If an input is not
              provided, the default value 100 will be used.

            * **deleteFileThreshold** *(integer) --*

              The minimum number of deletes that must be present in a
              data file to make it eligible for compaction. This
              parameter helps optimize compaction by focusing on files
              that contain a significant number of delete operations,
              which can improve query performance by removing deleted
              records. If an input is not provided, the default value
              1 will be used.

        * **retentionConfiguration** *(dict) --*

          The configuration for a snapshot retention optimizer.

          * **icebergConfiguration** *(dict) --*

            The configuration for an Iceberg snapshot retention
            optimizer.

            * **snapshotRetentionPeriodInDays** *(integer) --*

              The number of days to retain the Iceberg snapshots. If
              an input is not provided, the corresponding Iceberg
              table configuration field will be used or if not
              present, the default value 5 will be used.

            * **numberOfSnapshotsToRetain** *(integer) --*

              The number of Iceberg snapshots to retain within the
              retention period. If an input is not provided, the
              corresponding Iceberg table configuration field will be
              used or if not present, the default value 1 will be
              used.

            * **cleanExpiredFiles** *(boolean) --*

              If set to false, snapshots are only deleted from table
              metadata, and the underlying data and metadata files are
              not deleted.

            * **runRateInHours** *(integer) --*

              The interval in hours between retention job runs. This
              parameter controls how frequently the retention
              optimizer will run to clean up expired snapshots. The
              value must be between 3 and 168 hours (7 days). If an
              input is not provided, the default value 24 will be
              used.

        * **orphanFileDeletionConfiguration** *(dict) --*

          The configuration for an orphan file deletion optimizer.

          * **icebergConfiguration** *(dict) --*

            The configuration for an Iceberg orphan file deletion
            optimizer.

            * **orphanFileRetentionPeriodInDays** *(integer) --*

              The number of days that orphan files should be retained
              before file deletion. If an input is not provided, the
              default value 3 will be used.

            * **location** *(string) --*

              Specifies a directory in which to look for files
              (defaults to the table's location). You may choose a
              sub-directory rather than the top-level table location.

            * **runRateInHours** *(integer) --*

              The interval in hours between orphan file deletion job
              runs. This parameter controls how frequently the orphan
              file deletion optimizer will run to clean up orphan
              files. The value must be between 3 and 168 hours (7
              days). If an input is not provided, the default value 24
              will be used.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.ValidationException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.ThrottlingException"

   * "Glue.Client.exceptions.ConcurrentModificationException"
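
   As a rough sketch of how the "TableOptimizerConfiguration"
   structure above might be assembled, the helpers below build
   compaction and retention configurations. The helper names, the
   account ID, the role ARN, and the database and table names are
   hypothetical. The sketch assumes the 3 to 168 hour range for
   "runRateInHours" is inclusive.

```python
VALID_STRATEGIES = ("binpack", "sort", "z-order")


def build_compaction_configuration(
    role_arn: str,
    strategy: str = "binpack",
    min_input_files: int = 100,
    delete_file_threshold: int = 1,
    enabled: bool = True,
) -> dict:
    """Build a TableOptimizerConfiguration for an Iceberg compaction optimizer."""
    if strategy not in VALID_STRATEGIES:
        raise ValueError(f"strategy must be one of {VALID_STRATEGIES}")
    return {
        "roleArn": role_arn,
        "enabled": enabled,
        "compactionConfiguration": {
            "icebergConfiguration": {
                "strategy": strategy,
                "minInputFiles": min_input_files,
                "deleteFileThreshold": delete_file_threshold,
            }
        },
    }


def build_retention_configuration(
    role_arn: str,
    retention_days: int = 5,
    snapshots_to_retain: int = 1,
    clean_expired_files: bool = True,
    run_rate_in_hours: int = 24,
    enabled: bool = True,
) -> dict:
    """Build a TableOptimizerConfiguration for an Iceberg retention optimizer."""
    # Assumption: the documented 3..168 hour range is inclusive.
    if not 3 <= run_rate_in_hours <= 168:
        raise ValueError("run_rate_in_hours must be between 3 and 168")
    return {
        "roleArn": role_arn,
        "enabled": enabled,
        "retentionConfiguration": {
            "icebergConfiguration": {
                "snapshotRetentionPeriodInDays": retention_days,
                "numberOfSnapshotsToRetain": snapshots_to_retain,
                "cleanExpiredFiles": clean_expired_files,
                "runRateInHours": run_rate_in_hours,
            }
        },
    }


# Hypothetical identifiers; pass as client.update_table_optimizer(**request_kwargs).
request_kwargs = {
    "CatalogId": "123456789012",
    "DatabaseName": "example_db",
    "TableName": "events",
    "Type": "compaction",
    "TableOptimizerConfiguration": build_compaction_configuration(
        role_arn="arn:aws:iam::123456789012:role/ExampleGlueOptimizerRole",
        strategy="sort",
    ),
}
```

   The "Type" value and the configuration key are kept in step by
   hand here ("compaction" with "compactionConfiguration"); a caller
   updating a retention optimizer would pass "Type='retention'" with
   the retention builder instead.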


create_workflow
***************

Glue.Client.create_workflow(**kwargs)

   Creates a new workflow.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_workflow(
          Name='string',
          Description='string',
          DefaultRunProperties={
              'string': 'string'
          },
          Tags={
              'string': 'string'
          },
          MaxConcurrentRuns=123
      )

   Parameters:
      * **Name** (*string*) --

        **[REQUIRED]**

        The name to be assigned to the workflow. It should be unique
        within your account.

      * **Description** (*string*) -- A description of the workflow.

      * **DefaultRunProperties** (*dict*) --

        A collection of properties to be used as part of each
        execution of the workflow.

        Run properties may be logged. Do not pass plaintext secrets as
        properties. Retrieve secrets from a Glue Connection, Amazon
        Web Services Secrets Manager or other secret management
        mechanism if you intend to use them within the workflow run.

        * *(string) --*

          * *(string) --*

      * **Tags** (*dict*) --

        The tags to be used with this workflow.

        * *(string) --*

          * *(string) --*

      * **MaxConcurrentRuns** (*integer*) -- You can use this
        parameter to prevent unwanted multiple updates to data, to
        control costs, or in some cases, to prevent exceeding the
        maximum number of concurrent runs of any of the component
        jobs. If you leave this parameter blank, there is no limit to
        the number of concurrent workflow runs.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Name': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Name** *(string) --*

          The name of the workflow which was provided as part of the
          request.

   **Exceptions**

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.ConcurrentModificationException"
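
   The optional parameters above can be assembled as in the sketch
   below. The helper "build_create_workflow_kwargs" and all names and
   values are hypothetical. "MaxConcurrentRuns" is omitted when no
   limit is wanted, which matches the behaviour described for a blank
   value.

```python
from typing import Dict, Optional


def build_create_workflow_kwargs(
    name: str,
    description: Optional[str] = None,
    default_run_properties: Optional[Dict[str, str]] = None,
    tags: Optional[Dict[str, str]] = None,
    max_concurrent_runs: Optional[int] = None,
) -> dict:
    """Build keyword arguments for create_workflow, skipping unset options."""
    kwargs: dict = {"Name": name}
    if description is not None:
        kwargs["Description"] = description
    if default_run_properties:
        # Run properties may be logged, so pass references, not secrets.
        kwargs["DefaultRunProperties"] = dict(default_run_properties)
    if tags:
        kwargs["Tags"] = dict(tags)
    if max_concurrent_runs is not None:
        # Leaving this key out means no limit on concurrent workflow runs.
        kwargs["MaxConcurrentRuns"] = max_concurrent_runs
    return kwargs


# Hypothetical workflow limited to one run at a time.
limited = build_create_workflow_kwargs(
    name="nightly-etl",
    description="Loads the events table every night",
    default_run_properties={"target_table": "events"},
    tags={"team": "data-platform"},
    max_concurrent_runs=1,
)

# Hypothetical workflow with no concurrency limit.
unlimited = build_create_workflow_kwargs(name="ad-hoc-backfill")
```

   Either dictionary could then be passed as
   "client.create_workflow(**limited)".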


get_trigger
***********

Glue.Client.get_trigger(**kwargs)

   Retrieves the definition of a trigger.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_trigger(
          Name='string'
      )

   Parameters:
      **Name** (*string*) --

      **[REQUIRED]**

      The name of the trigger to retrieve.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Trigger': {
                 'Name': 'string',
                 'WorkflowName': 'string',
                 'Id': 'string',
                 'Type': 'SCHEDULED'|'CONDITIONAL'|'ON_DEMAND'|'EVENT',
                 'State': 'CREATING'|'CREATED'|'ACTIVATING'|'ACTIVATED'|'DEACTIVATING'|'DEACTIVATED'|'DELETING'|'UPDATING',
                 'Description': 'string',
                 'Schedule': 'string',
                 'Actions': [
                     {
                         'JobName': 'string',
                         'Arguments': {
                             'string': 'string'
                         },
                         'Timeout': 123,
                         'SecurityConfiguration': 'string',
                         'NotificationProperty': {
                             'NotifyDelayAfter': 123
                         },
                         'CrawlerName': 'string'
                     },
                 ],
                 'Predicate': {
                     'Logical': 'AND'|'ANY',
                     'Conditions': [
                         {
                             'LogicalOperator': 'EQUALS',
                             'JobName': 'string',
                             'State': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT'|'ERROR'|'WAITING'|'EXPIRED',
                             'CrawlerName': 'string',
                             'CrawlState': 'RUNNING'|'CANCELLING'|'CANCELLED'|'SUCCEEDED'|'FAILED'|'ERROR'
                         },
                     ]
                 },
                 'EventBatchingCondition': {
                     'BatchSize': 123,
                     'BatchWindow': 123
                 }
             }
         }

      **Response Structure**

      * *(dict) --*

        * **Trigger** *(dict) --*

          The requested trigger definition.

          * **Name** *(string) --*

            The name of the trigger.

          * **WorkflowName** *(string) --*

            The name of the workflow associated with the trigger.

          * **Id** *(string) --*

            Reserved for future use.

          * **Type** *(string) --*

            The type of trigger that this is.

          * **State** *(string) --*

            The current state of the trigger.

          * **Description** *(string) --*

            A description of this trigger.

          * **Schedule** *(string) --*

            A "cron" expression used to specify the schedule (see
            Time-Based Schedules for Jobs and Crawlers). For example,
            to run something every day at 12:15 UTC, you would
            specify: "cron(15 12 * * ? *)".

          * **Actions** *(list) --*

            The actions initiated by this trigger.

            * *(dict) --*

              Defines an action to be initiated by a trigger.

              * **JobName** *(string) --*

                The name of a job to be run.

              * **Arguments** *(dict) --*

                The job arguments used when this trigger fires. For
                this job run, they replace the default arguments set
                in the job definition itself.

                You can specify arguments here that your own
                job-execution script consumes, as well as arguments
                that Glue itself consumes.

                For information about how to specify and consume your
                own Job arguments, see the Calling Glue APIs in Python
                topic in the developer guide.

                For information about the key-value pairs that Glue
                consumes to set up your job, see the Special
                Parameters Used by Glue topic in the developer guide.

                * *(string) --*

                  * *(string) --*

              * **Timeout** *(integer) --*

                The "JobRun" timeout in minutes. This is the maximum
                time that a job run can consume resources before it is
                terminated and enters "TIMEOUT" status. This overrides
                the timeout value set in the parent job.

                Jobs must have timeout values less than 7 days or
                10080 minutes. Otherwise, the jobs will throw an
                exception.

                When the value is left blank, the timeout defaults to
                2880 minutes.

                Any existing Glue jobs that had a timeout value
                greater than 7 days will default to 7 days. For
                instance, if you have specified a timeout of 20 days
                for a batch job, it will be stopped on the 7th day.

                For streaming jobs, if you have set up a maintenance
                window, it will be restarted during the maintenance
                window after 7 days.

              * **SecurityConfiguration** *(string) --*

                The name of the "SecurityConfiguration" structure to
                be used with this action.

              * **NotificationProperty** *(dict) --*

                Specifies configuration properties of a job run
                notification.

                * **NotifyDelayAfter** *(integer) --*

                  After a job run starts, the number of minutes to
                  wait before sending a job run delay notification.

              * **CrawlerName** *(string) --*

                The name of the crawler to be used with this action.

          * **Predicate** *(dict) --*

            The predicate of this trigger, which defines when it will
            fire.

            * **Logical** *(string) --*

              An optional field if only one condition is listed. If
              multiple conditions are listed, then this field is
              required.

            * **Conditions** *(list) --*

              A list of the conditions that determine when the trigger
              will fire.

              * *(dict) --*

                Defines a condition under which a trigger fires.

                * **LogicalOperator** *(string) --*

                  A logical operator.

                * **JobName** *(string) --*

                  The name of the job whose "JobRuns" this condition
                  applies to, and on which this trigger waits.

                * **State** *(string) --*

                  The condition state. Currently, the only job states
                  that a trigger can listen for are "SUCCEEDED",
                  "STOPPED", "FAILED", and "TIMEOUT". The only crawler
                  states that a trigger can listen for are
                  "SUCCEEDED", "FAILED", and "CANCELLED".

                * **CrawlerName** *(string) --*

                  The name of the crawler to which this condition
                  applies.

                * **CrawlState** *(string) --*

                  The state of the crawler to which this condition
                  applies.

          * **EventBatchingCondition** *(dict) --*

            The batch condition that must be met (a specified number
            of events received or the batch time window expired)
            before the EventBridge event trigger fires.

            * **BatchSize** *(integer) --*

              The number of events that must be received from Amazon
              EventBridge before the EventBridge event trigger fires.

            * **BatchWindow** *(integer) --*

              The window of time, in seconds, after which the
              EventBridge event trigger fires. The window starts when
              the first event is received.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
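
   To illustrate how the response structure above might be read, the
   sketch below condenses a "Trigger" dict into a one-line summary.
   The helper "describe_trigger" and the sample response are invented
   for illustration; a real response comes from
   "client.get_trigger(Name=...)".

```python
def describe_trigger(response: dict) -> str:
    """Summarize the Trigger in a get_trigger response as one line."""
    trigger = response["Trigger"]
    targets = []
    for action in trigger.get("Actions", []):
        # An action names either a job or a crawler to start.
        if "JobName" in action:
            targets.append(f"job:{action['JobName']}")
        if "CrawlerName" in action:
            targets.append(f"crawler:{action['CrawlerName']}")
    schedule = trigger.get("Schedule", "no schedule")
    return (
        f"{trigger['Name']} [{trigger['Type']}/{trigger['State']}] "
        f"{schedule} -> {', '.join(targets)}"
    )


# Made-up response shaped like the Response Syntax above.
sample_response = {
    "Trigger": {
        "Name": "nightly-load",
        "Type": "SCHEDULED",
        "State": "ACTIVATED",
        "Schedule": "cron(15 12 * * ? *)",
        "Actions": [
            {"JobName": "load-events", "Timeout": 60},
            {"CrawlerName": "events-crawler"},
        ],
    }
}

summary = describe_trigger(sample_response)
```

   Optional keys are read with "dict.get" because a trigger of type
   "ON_DEMAND" or "CONDITIONAL" may not carry a "Schedule".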


get_statement
*************

Glue.Client.get_statement(**kwargs)

   Retrieves the statement.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_statement(
          SessionId='string',
          Id=123,
          RequestOrigin='string'
      )

   Parameters:
      * **SessionId** (*string*) --

        **[REQUIRED]**

        The Session ID of the statement.

      * **Id** (*integer*) --

        **[REQUIRED]**

        The Id of the statement.

      * **RequestOrigin** (*string*) -- The origin of the request.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Statement': {
                 'Id': 123,
                 'Code': 'string',
                 'State': 'WAITING'|'RUNNING'|'AVAILABLE'|'CANCELLING'|'CANCELLED'|'ERROR',
                 'Output': {
                     'Data': {
                         'TextPlain': 'string'
                     },
                     'ExecutionCount': 123,
                     'Status': 'WAITING'|'RUNNING'|'AVAILABLE'|'CANCELLING'|'CANCELLED'|'ERROR',
                     'ErrorName': 'string',
                     'ErrorValue': 'string',
                     'Traceback': [
                         'string',
                     ]
                 },
                 'Progress': 123.0,
                 'StartedOn': 123,
                 'CompletedOn': 123
             }
         }

      **Response Structure**

      * *(dict) --*

        * **Statement** *(dict) --*

          Returns the statement.

          * **Id** *(integer) --*

            The ID of the statement.

          * **Code** *(string) --*

            The execution code of the statement.

          * **State** *(string) --*

            The state of the statement while the request is being
            processed.

          * **Output** *(dict) --*

            The output in JSON.

            * **Data** *(dict) --*

              The code execution output.

              * **TextPlain** *(string) --*

                The code execution output in text format.

            * **ExecutionCount** *(integer) --*

              The execution count of the output.

            * **Status** *(string) --*

              The status of the code execution output.

            * **ErrorName** *(string) --*

              The name of the error in the output.

            * **ErrorValue** *(string) --*

              The error value of the output.

            * **Traceback** *(list) --*

              The traceback of the output.

              * *(string) --*

          * **Progress** *(float) --*

            The code execution progress.

          * **StartedOn** *(integer) --*

            The Unix time and date at which the statement was
            started.

          * **CompletedOn** *(integer) --*

            The Unix time and date at which the statement was
            completed.

   **Exceptions**

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.IllegalSessionStateException"
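
   Because "State" changes while a statement runs, callers often poll
   "get_statement" until it settles. The sketch below shows one way to
   do that. The helper names are hypothetical, and treating
   "AVAILABLE", "CANCELLED", and "ERROR" as final states is an
   assumption, not something this reference states.

```python
import time
from typing import Callable, Optional

# Assumption: these states mean the statement will not change further.
TERMINAL_STATES = {"AVAILABLE", "CANCELLED", "ERROR"}


def wait_for_statement(
    client,
    session_id: str,
    statement_id: int,
    poll_seconds: float = 2.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll get_statement until the statement reaches a terminal state."""
    for _ in range(max_polls):
        response = client.get_statement(SessionId=session_id, Id=statement_id)
        statement = response["Statement"]
        if statement.get("State") in TERMINAL_STATES:
            return statement
        sleep(poll_seconds)
    raise TimeoutError(
        f"statement {statement_id} did not finish after {max_polls} polls"
    )


def statement_text(statement: dict) -> Optional[str]:
    """Return the plain-text output of a statement, or None if absent."""
    return statement.get("Output", {}).get("Data", {}).get("TextPlain")
```

   With a real client this might be called as
   "wait_for_statement(boto3.client('glue'), session_id, statement_id)".
   The "sleep" parameter exists only so the wait can be replaced in
   tests.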


start_crawler_schedule
**********************

Glue.Client.start_crawler_schedule(**kwargs)

   Changes the schedule state of the specified crawler to "SCHEDULED",
   unless the crawler is already running or the schedule state is
   already "SCHEDULED".

   See also: AWS API Documentation

   **Request Syntax**

      response = client.start_crawler_schedule(
          CrawlerName='string'
      )

   Parameters:
      **CrawlerName** (*string*) --

      **[REQUIRED]**

      Name of the crawler to schedule.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.SchedulerRunningException"

   * "Glue.Client.exceptions.SchedulerTransitioningException"

   * "Glue.Client.exceptions.NoScheduleException"

   * "Glue.Client.exceptions.OperationTimeoutException"
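
   Since the call has no effect when the schedule is already active,
   a caller may want to treat that case as success. The sketch below
   does so. The helper "ensure_crawler_scheduled" is hypothetical, and
   it assumes "SchedulerRunningException" is what the service raises
   when the schedule is already running.

```python
def ensure_crawler_scheduled(client, crawler_name: str) -> bool:
    """Start a crawler's schedule; return False if it was already running."""
    try:
        client.start_crawler_schedule(CrawlerName=crawler_name)
    except client.exceptions.SchedulerRunningException:
        # Assumption: this exception means the schedule is already active.
        return False
    return True
```

   Other listed exceptions, such as "NoScheduleException" or
   "EntityNotFoundException", are left to propagate because they
   indicate a problem the caller needs to fix.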


batch_delete_partition
**********************

Glue.Client.batch_delete_partition(**kwargs)

   Deletes one or more partitions in a batch operation.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.batch_delete_partition(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          PartitionsToDelete=[
              {
                  'Values': [
                      'string',
                  ]
              },
          ]
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog where
        the partition to be deleted resides. If none is provided, the
        Amazon Web Services account ID is used by default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the catalog database in which the table in
        question resides.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table that contains the partitions to be
        deleted.

      * **PartitionsToDelete** (*list*) --

        **[REQUIRED]**

        A list of "PartitionInput" structures that define the
        partitions to be deleted.

        * *(dict) --*

          Contains a list of values defining partitions.

          * **Values** *(list) --* **[REQUIRED]**

            The list of values.

            * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Errors': [
                 {
                     'PartitionValues': [
                         'string',
                     ],
                     'ErrorDetail': {
                         'ErrorCode': 'string',
                         'ErrorMessage': 'string'
                     }
                 },
             ]
         }

      **Response Structure**

      * *(dict) --*

        * **Errors** *(list) --*

          The errors encountered when trying to delete the requested
          partitions.

          * *(dict) --*

            Contains information about a partition error.

            * **PartitionValues** *(list) --*

              The values that define the partition.

              * *(string) --*

            * **ErrorDetail** *(dict) --*

              The details about the partition error.

              * **ErrorCode** *(string) --*

                The code associated with this error.

              * **ErrorMessage** *(string) --*

                A message describing the error.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
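
   Because failures are reported per partition in "Errors" rather than
   raised, callers typically need to collect them across calls. The
   sketch below is one way to do that under stated assumptions: the
   helper name "delete_partitions" is hypothetical, "client" is
   assumed to be a Glue client created with "boto3.client('glue')",
   and the default batch size of 25 is an assumption about the
   per-request limit that should be checked against the AWS API
   documentation.

```python
def delete_partitions(client, database, table, partition_values, batch_size=25):
    """Delete partitions in batches and return all per-partition errors.

    partition_values is a list of value lists, one per partition,
    for example [["2024", "01"], ["2024", "02"]].
    batch_size=25 is an assumed per-request limit, not taken from this page.
    """
    errors = []
    for start in range(0, len(partition_values), batch_size):
        chunk = partition_values[start:start + batch_size]
        response = client.batch_delete_partition(
            DatabaseName=database,
            TableName=table,
            PartitionsToDelete=[{"Values": list(values)} for values in chunk],
        )
        # Each entry holds 'PartitionValues' and an 'ErrorDetail' dict.
        errors.extend(response.get("Errors", []))
    return errors
```

   An empty list returned by this helper means every call completed
   without reporting a partition-level error; request-level problems
   such as "InvalidInputException" still surface as exceptions.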


get_workflow_run
****************

Glue.Client.get_workflow_run(**kwargs)

   Retrieves the metadata for a given workflow run. Job run history is
   accessible for 90 days for your workflow and job run.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_workflow_run(
          Name='string',
          RunId='string',
          IncludeGraph=True|False
      )

   Parameters:
      * **Name** (*string*) --

        **[REQUIRED]**

        Name of the workflow being run.

      * **RunId** (*string*) --

        **[REQUIRED]**

        The ID of the workflow run.

      * **IncludeGraph** (*boolean*) -- Specifies whether to include
        the workflow graph in the response.
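
   A common use of these parameters is to poll a run until it
   finishes, leaving the graph out to keep each response small. The
   following is a minimal sketch under stated assumptions: the helper
   name "wait_for_workflow_run", the polling interval, and the choice
   to treat every status other than "RUNNING" and "STOPPING" as final
   are all illustrative, and "client" is assumed to be a Glue client
   created with "boto3.client('glue')".

```python
import time


def wait_for_workflow_run(client, name, run_id, poll_seconds=30,
                          max_polls=120, sleep=time.sleep):
    """Poll get_workflow_run until the run leaves RUNNING/STOPPING.

    Returns the final 'Run' dict. Raises TimeoutError if the run is
    still in progress after max_polls requests. The `sleep` parameter
    exists so that tests can replace the real delay.
    """
    for _ in range(max_polls):
        run = client.get_workflow_run(
            Name=name,
            RunId=run_id,
            IncludeGraph=False,  # the graph is not needed for a status check
        )["Run"]
        if run["Status"] not in ("RUNNING", "STOPPING"):
            return run
        sleep(poll_seconds)
    raise TimeoutError(
        f"workflow run {run_id!r} of {name!r} still in progress "
        f"after {max_polls} polls"
    )
```

   Once the run has reached a final status, a single follow-up call
   with "IncludeGraph=True" can retrieve the full graph for
   inspection.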

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Run': {
                 'Name': 'string',
                 'WorkflowRunId': 'string',
                 'PreviousRunId': 'string',
                 'WorkflowRunProperties': {
                     'string': 'string'
                 },
                 'StartedOn': datetime(2015, 1, 1),
                 'CompletedOn': datetime(2015, 1, 1),
                 'Status': 'RUNNING'|'COMPLETED'|'STOPPING'|'STOPPED'|'ERROR',
                 'ErrorMessage': 'string',
                 'Statistics': {
                     'TotalActions': 123,
                     'TimeoutActions': 123,
                     'FailedActions': 123,
                     'StoppedActions': 123,
                     'SucceededActions': 123,
                     'RunningActions': 123,
                     'ErroredActions': 123,
                     'WaitingActions': 123
                 },
                 'Graph': {
                     'Nodes': [
                         {
                             'Type': 'CRAWLER'|'JOB'|'TRIGGER',
                             'Name': 'string',
                             'UniqueId': 'string',
                             'TriggerDetails': {
                                 'Trigger': {
                                     'Name': 'string',
                                     'WorkflowName': 'string',
                                     'Id': 'string',
                                     'Type': 'SCHEDULED'|'CONDITIONAL'|'ON_DEMAND'|'EVENT',
                                     'State': 'CREATING'|'CREATED'|'ACTIVATING'|'ACTIVATED'|'DEACTIVATING'|'DEACTIVATED'|'DELETING'|'UPDATING',
                                     'Description': 'string',
                                     'Schedule': 'string',
                                     'Actions': [
                                         {
                                             'JobName': 'string',
                                             'Arguments': {
                                                 'string': 'string'
                                             },
                                             'Timeout': 123,
                                             'SecurityConfiguration': 'string',
                                             'NotificationProperty': {
                                                 'NotifyDelayAfter': 123
                                             },
                                             'CrawlerName': 'string'
                                         },
                                     ],
                                     'Predicate': {
                                         'Logical': 'AND'|'ANY',
                                         'Conditions': [
                                             {
                                                 'LogicalOperator': 'EQUALS',
                                                 'JobName': 'string',
                                                 'State': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT'|'ERROR'|'WAITING'|'EXPIRED',
                                                 'CrawlerName': 'string',
                                                 'CrawlState': 'RUNNING'|'CANCELLING'|'CANCELLED'|'SUCCEEDED'|'FAILED'|'ERROR'
                                             },
                                         ]
                                     },
                                     'EventBatchingCondition': {
                                         'BatchSize': 123,
                                         'BatchWindow': 123
                                     }
                                 }
                             },
                             'JobDetails': {
                                 'JobRuns': [
                                     {
                                         'Id': 'string',
                                         'Attempt': 123,
                                         'PreviousRunId': 'string',
                                         'TriggerName': 'string',
                                         'JobName': 'string',
                                         'JobMode': 'SCRIPT'|'VISUAL'|'NOTEBOOK',
                                         'JobRunQueuingEnabled': True|False,
                                         'StartedOn': datetime(2015, 1, 1),
                                         'LastModifiedOn': datetime(2015, 1, 1),
                                         'CompletedOn': datetime(2015, 1, 1),
                                         'JobRunState': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT'|'ERROR'|'WAITING'|'EXPIRED',
                                         'Arguments': {
                                             'string': 'string'
                                         },
                                         'ErrorMessage': 'string',
                                         'PredecessorRuns': [
                                             {
                                                 'JobName': 'string',
                                                 'RunId': 'string'
                                             },
                                         ],
                                         'AllocatedCapacity': 123,
                                         'ExecutionTime': 123,
                                         'Timeout': 123,
                                         'MaxCapacity': 123.0,
                                         'WorkerType': 'Standard'|'G.1X'|'G.2X'|'G.025X'|'G.4X'|'G.8X'|'Z.2X',
                                         'NumberOfWorkers': 123,
                                         'SecurityConfiguration': 'string',
                                         'LogGroupName': 'string',
                                         'NotificationProperty': {
                                             'NotifyDelayAfter': 123
                                         },
                                         'GlueVersion': 'string',
                                         'DPUSeconds': 123.0,
                                         'ExecutionClass': 'FLEX'|'STANDARD',
                                         'MaintenanceWindow': 'string',
                                         'ProfileName': 'string',
                                         'StateDetail': 'string',
                                         'ExecutionRoleSessionPolicy': 'string'
                                     },
                                 ]
                             },
                             'CrawlerDetails': {
                                 'Crawls': [
                                     {
                                         'State': 'RUNNING'|'CANCELLING'|'CANCELLED'|'SUCCEEDED'|'FAILED'|'ERROR',
                                         'StartedOn': datetime(2015, 1, 1),
                                         'CompletedOn': datetime(2015, 1, 1),
                                         'ErrorMessage': 'string',
                                         'LogGroup': 'string',
                                         'LogStream': 'string'
                                     },
                                 ]
                             }
                         },
                     ],
                     'Edges': [
                         {
                             'SourceId': 'string',
                             'DestinationId': 'string'
                         },
                     ]
                 },
                 'StartingEventBatchCondition': {
                     'BatchSize': 123,
                     'BatchWindow': 123
                 }
             }
         }
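
      The nested "Graph" structure shown above can be reduced to a
      short summary for logging or alerting. This is a sketch only:
      the function name "summarize_workflow_graph" and the shape of
      its return value are hypothetical, and treating "FAILED",
      "TIMEOUT", and "ERROR" as the job run states worth reporting is
      an assumption. It expects the "Run" dict from a call made with
      "IncludeGraph=True".

```python
from collections import Counter


def summarize_workflow_graph(run):
    """Summarize the 'Graph' of a workflow run dict.

    Returns node counts per type, edges as (source name, destination
    name) pairs, and (job name, job run ID) pairs for job runs in a
    failure state. Missing keys are treated as empty.
    """
    graph = run.get("Graph", {})
    nodes = graph.get("Nodes", [])

    node_types = Counter(node["Type"] for node in nodes)

    # Edges refer to nodes by UniqueId; map those IDs back to names.
    names = {node["UniqueId"]: node["Name"] for node in nodes}
    edges = [
        (names.get(edge["SourceId"], edge["SourceId"]),
         names.get(edge["DestinationId"], edge["DestinationId"]))
        for edge in graph.get("Edges", [])
    ]

    failed_job_runs = []
    for node in nodes:
        for job_run in node.get("JobDetails", {}).get("JobRuns", []):
            if job_run.get("JobRunState") in ("FAILED", "TIMEOUT", "ERROR"):
                failed_job_runs.append((job_run.get("JobName"), job_run.get("Id")))

    return {
        "node_types": dict(node_types),
        "edges": edges,
        "failed_job_runs": failed_job_runs,
    }
```

      For a response obtained as "response =
      client.get_workflow_run(Name=..., RunId=...,
      IncludeGraph=True)", the summary would be computed from
      "response['Run']".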

      **Response Structure**

      * *(dict) --*

        * **Run** *(dict) --*

          The requested workflow run metadata.

          * **Name** *(string) --*

            Name of the workflow that was run.

          * **WorkflowRunId** *(string) --*

            The ID of this workflow run.

          * **PreviousRunId** *(string) --*

            The ID of the previous workflow run.

          * **WorkflowRunProperties** *(dict) --*

            The workflow run properties which were set during the run.

            * *(string) --*

              * *(string) --*

          * **StartedOn** *(datetime) --*

            The date and time when the workflow run was started.

          * **CompletedOn** *(datetime) --*

            The date and time when the workflow run completed.

          * **Status** *(string) --*

            The status of the workflow run.

          * **ErrorMessage** *(string) --*

            This error message describes any error that may have
            occurred in starting the workflow run. Currently the only
            error message is "Concurrent runs exceeded for workflow:
            "foo"."

          * **Statistics** *(dict) --*

            The statistics of the run.

            * **TotalActions** *(integer) --*

              Total number of Actions in the workflow run.

            * **TimeoutActions** *(integer) --*

              Total number of Actions that timed out.

            * **FailedActions** *(integer) --*

              Total number of Actions that have failed.

            * **StoppedActions** *(integer) --*

              Total number of Actions that have stopped.

            * **SucceededActions** *(integer) --*

              Total number of Actions that have succeeded.

            * **RunningActions** *(integer) --*

              Total number of Actions in running state.

            * **ErroredActions** *(integer) --*

              Indicates the count of job runs in the ERROR state in
              the workflow run.

            * **WaitingActions** *(integer) --*

              Indicates the count of job runs in WAITING state in the
              workflow run.

          * **Graph** *(dict) --*

            The graph representing all the Glue components that belong
            to the workflow as nodes and directed connections between
            them as edges.

            * **Nodes** *(list) --*

              A list of the Glue components that belong to the
              workflow, represented as nodes.

              * *(dict) --*

                A node represents a Glue component (trigger, crawler,
                or job) on a workflow graph.

                * **Type** *(string) --*

                  The type of Glue component represented by the node.

                * **Name** *(string) --*

                  The name of the Glue component represented by the
                  node.

                * **UniqueId** *(string) --*

                  The unique Id assigned to the node within the
                  workflow.

                * **TriggerDetails** *(dict) --*

                  Details of the Trigger when the node represents a
                  Trigger.

                  * **Trigger** *(dict) --*

                    The information of the trigger represented by the
                    trigger node.

                    * **Name** *(string) --*

                      The name of the trigger.

                    * **WorkflowName** *(string) --*

                      The name of the workflow associated with the
                      trigger.

                    * **Id** *(string) --*

                      Reserved for future use.

                    * **Type** *(string) --*

                      The type of trigger that this is.

                    * **State** *(string) --*

                      The current state of the trigger.

                    * **Description** *(string) --*

                      A description of this trigger.

                    * **Schedule** *(string) --*

                      A "cron" expression used to specify the schedule
                      (see Time-Based Schedules for Jobs and Crawlers).
                      For example, to run something every day at 12:15
                      UTC, you would specify: "cron(15 12 * * ? *)".

                    * **Actions** *(list) --*

                      The actions initiated by this trigger.

                      * *(dict) --*

                        Defines an action to be initiated by a
                        trigger.

                        * **JobName** *(string) --*

                          The name of a job to be run.

                        * **Arguments** *(dict) --*

                          The job arguments used when this trigger
                          fires. For this job run, they replace the
                          default arguments set in the job definition
                          itself.

                          You can specify arguments here that your own
                          job-execution script consumes, as well as
                          arguments that Glue itself consumes.

                          For information about how to specify and
                          consume your own Job arguments, see the
                          Calling Glue APIs in Python topic in the
                          developer guide.

                          For information about the key-value pairs
                          that Glue consumes to set up your job, see
                          the Special Parameters Used by Glue topic in
                          the developer guide.

                          * *(string) --*

                            * *(string) --*

                        * **Timeout** *(integer) --*

                          The "JobRun" timeout in minutes. This is the
                          maximum time that a job run can consume
                          resources before it is terminated and enters
                          "TIMEOUT" status. This overrides the timeout
                          value set in the parent job.

                          Jobs must have timeout values less than 7
                          days or 10080 minutes. Otherwise, the jobs
                          will throw an exception.

                          When the value is left blank, the timeout is
                          defaulted to 2880 minutes.

                          Any existing Glue jobs that had a timeout
                          value greater than 7 days will be defaulted
                          to 7 days. For instance, if you have
                          specified a timeout of 20 days for a batch
                          job, it will be stopped on the 7th day.

                          For streaming jobs, if you have set up a
                          maintenance window, it will be restarted
                          during the maintenance window after 7 days.

                        * **SecurityConfiguration** *(string) --*

                          The name of the "SecurityConfiguration"
                          structure to be used with this action.

                        * **NotificationProperty** *(dict) --*

                          Specifies configuration properties of a job
                          run notification.

                          * **NotifyDelayAfter** *(integer) --*

                            After a job run starts, the number of
                            minutes to wait before sending a job run
                            delay notification.

                        * **CrawlerName** *(string) --*

                          The name of the crawler to be used with this
                          action.

                    * **Predicate** *(dict) --*

                      The predicate of this trigger, which defines
                      when it will fire.

                      * **Logical** *(string) --*

                        An optional field if only one condition is
                        listed. If multiple conditions are listed,
                        then this field is required.

                      * **Conditions** *(list) --*

                        A list of the conditions that determine when
                        the trigger will fire.

                        * *(dict) --*

                          Defines a condition under which a trigger
                          fires.

                          * **LogicalOperator** *(string) --*

                            A logical operator.

                          * **JobName** *(string) --*

                            The name of the job whose "JobRuns" this
                            condition applies to, and on which this
                            trigger waits.

                          * **State** *(string) --*

                            The condition state. Currently, the only
                            job states that a trigger can listen for
                            are "SUCCEEDED", "STOPPED", "FAILED", and
                            "TIMEOUT". The only crawler states that a
                            trigger can listen for are "SUCCEEDED",
                            "FAILED", and "CANCELLED".

                          * **CrawlerName** *(string) --*

                            The name of the crawler to which this
                            condition applies.

                          * **CrawlState** *(string) --*

                            The state of the crawler to which this
                            condition applies.

                    * **EventBatchingCondition** *(dict) --*

                      Batch condition that must be met (specified
                      number of events received or batch time window
                      expired) before EventBridge event trigger fires.

                      * **BatchSize** *(integer) --*

                        Number of events that must be received from
                        Amazon EventBridge before EventBridge event
                        trigger fires.

                      * **BatchWindow** *(integer) --*

                        Window of time in seconds after which
                        EventBridge event trigger fires. Window starts
                        when first event is received.

                * **JobDetails** *(dict) --*

                  Details of the Job when the node represents a Job.

                  * **JobRuns** *(list) --*

                    The information for the job runs represented by
                    the job node.

                    * *(dict) --*

                      Contains information about a job run.

                      * **Id** *(string) --*

                        The ID of this job run.

                      * **Attempt** *(integer) --*

                        The number of the attempt to run this job.

                      * **PreviousRunId** *(string) --*

                        The ID of the previous run of this job. For
                        example, the "JobRunId" specified in the
                        "StartJobRun" action.

                      * **TriggerName** *(string) --*

                        The name of the trigger that started this job
                        run.

                      * **JobName** *(string) --*

                        The name of the job definition being used in
                        this run.

                      * **JobMode** *(string) --*

                        A mode that describes how a job was created.
                        Valid values are:

                        * "SCRIPT" - The job was created using the
                          Glue Studio script editor.

                        * "VISUAL" - The job was created using the
                          Glue Studio visual editor.

                        * "NOTEBOOK" - The job was created using an
                          interactive sessions notebook.

                        When the "JobMode" field is missing or null,
                        "SCRIPT" is assigned as the default value.

                      * **JobRunQueuingEnabled** *(boolean) --*

                        Specifies whether job run queuing is enabled
                        for the job run.

                        A value of true means job run queuing is
                        enabled for the job run. If false or not
                        populated, the job run will not be considered
                        for queuing.

                      * **StartedOn** *(datetime) --*

                        The date and time at which this job run was
                        started.

                      * **LastModifiedOn** *(datetime) --*

                        The last time that this job run was modified.

                      * **CompletedOn** *(datetime) --*

                        The date and time that this job run completed.

                      * **JobRunState** *(string) --*

                        The current state of the job run. For more
                        information about the statuses of jobs that
                        have terminated abnormally, see Glue Job Run
                        Statuses.

                      * **Arguments** *(dict) --*

                        The job arguments associated with this run.
                        For this job run, they replace the default
                        arguments set in the job definition itself.

                        You can specify arguments here that your own
                        job-execution script consumes, as well as
                        arguments that Glue itself consumes.

                        Job arguments may be logged. Do not pass
                        plaintext secrets as arguments. Retrieve
                        secrets from a Glue Connection, Secrets
                        Manager or other secret management mechanism
                        if you intend to keep them within the Job.

                        For information about how to specify and
                        consume your own Job arguments, see the
                        Calling Glue APIs in Python topic in the
                        developer guide.

                        For information about the arguments you can
                        provide to this field when configuring Spark
                        jobs, see the Special Parameters Used by Glue
                        topic in the developer guide.

                        For information about the arguments you can
                        provide to this field when configuring Ray
                        jobs, see Using job parameters in Ray jobs in
                        the developer guide.

                        * *(string) --*

                          * *(string) --*

                      * **ErrorMessage** *(string) --*

                        An error message associated with this job run.

                      * **PredecessorRuns** *(list) --*

                        A list of predecessors to this job run.

                        * *(dict) --*

                          A job run that was used in the predicate of
                          a conditional trigger that triggered this
                          job run.

                          * **JobName** *(string) --*

                            The name of the job definition used by the
                            predecessor job run.

                          * **RunId** *(string) --*

                            The job-run ID of the predecessor job run.

                      * **AllocatedCapacity** *(integer) --*

                        This field is deprecated. Use "MaxCapacity"
                        instead.

                        The number of Glue data processing units
                        (DPUs) allocated to this JobRun. From 2 to 100
                        DPUs can be allocated; the default is 10. A
                        DPU is a relative measure of processing power
                        that consists of 4 vCPUs of compute capacity
                        and 16 GB of memory. For more information, see
                        the Glue pricing page.

                      * **ExecutionTime** *(integer) --*

                        The amount of time (in seconds) that the job
                        run consumed resources.

                      * **Timeout** *(integer) --*

                        The "JobRun" timeout in minutes. This is the
                        maximum time that a job run can consume
                        resources before it is terminated and enters
                        "TIMEOUT" status. This value overrides the
                        timeout value set in the parent job.

                        Jobs must have timeout values less than 7 days
                        or 10080 minutes. Otherwise, the jobs will
                        throw an exception.

                        When the value is left blank, the timeout is
                        defaulted to 2880 minutes.

                        Any existing Glue jobs that had a timeout
                        value greater than 7 days will be defaulted to
                        7 days. For instance, if you have specified a
                        timeout of 20 days for a batch job, it will be
                        stopped on the 7th day.

                        For streaming jobs, if you have set up a
                        maintenance window, it will be restarted
                        during the maintenance window after 7 days.

                      * **MaxCapacity** *(float) --*

                        For Glue version 1.0 or earlier jobs, using
                        the standard worker type, the number of Glue
                        data processing units (DPUs) that can be
                        allocated when this job runs. A DPU is a
                        relative measure of processing power that
                        consists of 4 vCPUs of compute capacity and 16
                        GB of memory. For more information, see the
                        Glue pricing page.

                        For Glue version 2.0+ jobs, you cannot specify
                        a "Maximum capacity". Instead, you should
                        specify a "Worker type" and the "Number of
                        workers".

                        Do not set "MaxCapacity" if using "WorkerType"
                        and "NumberOfWorkers".

                        The value that can be allocated for
                        "MaxCapacity" depends on whether you are
                        running a Python shell job, an Apache Spark
                        ETL job, or an Apache Spark streaming ETL job:

                        * When you specify a Python shell job
                          ("JobCommand.Name"="pythonshell"), you can
                          allocate either 0.0625 or 1 DPU. The default
                          is 0.0625 DPU.

                        * When you specify an Apache Spark ETL job
                          ("JobCommand.Name"="glueetl") or Apache
                          Spark streaming ETL job
                          ("JobCommand.Name"="gluestreaming"), you can
                          allocate from 2 to 100 DPUs. The default is
                          10 DPUs. This job type cannot have a
                          fractional DPU allocation.

                      * **WorkerType** *(string) --*

                        The type of predefined worker that is
                        allocated when a job runs. Accepts a value of
                        G.1X, G.2X, G.4X, G.8X or G.025X for Spark
                        jobs. Accepts the value Z.2X for Ray jobs.

                        * For the "G.1X" worker type, each worker maps
                          to 1 DPU (4 vCPUs, 16 GB of memory) with
                          94GB disk, and provides 1 executor per
                          worker. We recommend this worker type for
                          workloads such as data transforms, joins,
                          and queries, as it offers a scalable and
                          cost-effective way to run most jobs.

                        * For the "G.2X" worker type, each worker maps
                          to 2 DPU (8 vCPUs, 32 GB of memory) with
                          138GB disk, and provides 1 executor per
                          worker. We recommend this worker type for
                          workloads such as data transforms, joins,
                          and queries, as it offers a scalable and
                          cost-effective way to run most jobs.

                        * For the "G.4X" worker type, each worker maps
                          to 4 DPU (16 vCPUs, 64 GB of memory) with
                          256GB disk, and provides 1 executor per
                          worker. We recommend this worker type for
                          jobs whose workloads contain your most
                          demanding transforms, aggregations, joins,
                          and queries. This worker type is available
                          only for Glue version 3.0 or later Spark ETL
                          jobs in the following Amazon Web Services
                          Regions: US East (Ohio), US East (N.
                          Virginia), US West (Oregon), Asia Pacific
                          (Singapore), Asia Pacific (Sydney), Asia
                          Pacific (Tokyo), Canada (Central), Europe
                          (Frankfurt), Europe (Ireland), and Europe
                          (Stockholm).

                        * For the "G.8X" worker type, each worker maps
                          to 8 DPU (32 vCPUs, 128 GB of memory) with
                          512GB disk, and provides 1 executor per
                          worker. We recommend this worker type for
                          jobs whose workloads contain your most
                          demanding transforms, aggregations, joins,
                          and queries. This worker type is available
                          only for Glue version 3.0 or later Spark ETL
                          jobs, in the same Amazon Web Services
                          Regions as supported for the "G.4X" worker
                          type.

                        * For the "G.025X" worker type, each worker
                          maps to 0.25 DPU (2 vCPUs, 4 GB of memory)
                          with 84GB disk, and provides 1 executor per
                          worker. We recommend this worker type for
                          low volume streaming jobs. This worker type
                          is only available for Glue version 3.0 or
                          later streaming jobs.

                        * For the "Z.2X" worker type, each worker maps
                          to 2 M-DPU (8 vCPUs, 64 GB of memory) with
                          128 GB disk, and provides up to 8 Ray
                          workers based on the autoscaler.

                      * **NumberOfWorkers** *(integer) --*

                        The number of workers of a defined
                        "workerType" that are allocated when a job
                        runs.

                      * **SecurityConfiguration** *(string) --*

                        The name of the "SecurityConfiguration"
                        structure to be used with this job run.

                      * **LogGroupName** *(string) --*

                        The name of the log group for secure logging
                        that can be server-side encrypted in Amazon
                        CloudWatch using KMS. This name can be "/aws-
                        glue/jobs/", in which case the default
                        encryption is "NONE". If you add a role name
                        and "SecurityConfiguration" name (in other
                        words, "/aws-glue/jobs-yourRoleName-
                        yourSecurityConfigurationName/"), then that
                        security configuration is used to encrypt the
                        log group.

                      * **NotificationProperty** *(dict) --*

                        Specifies configuration properties of a job
                        run notification.

                        * **NotifyDelayAfter** *(integer) --*

                          After a job run starts, the number of
                          minutes to wait before sending a job run
                          delay notification.

                      * **GlueVersion** *(string) --*

                        In Spark jobs, "GlueVersion" determines the
                        versions of Apache Spark and Python that Glue
                        makes available in a job. The Python version
                        indicates the version supported for jobs of
                        type Spark.

                        Ray jobs should set "GlueVersion" to "4.0" or
                        greater. However, the versions of Ray, Python
                        and additional libraries available in your Ray
                        job are determined by the "Runtime" parameter
                        of the Job command.

                        For more information about the available Glue
                        versions and corresponding Spark and Python
                        versions, see Glue version in the developer
                        guide.

                        Jobs that are created without specifying a
                        Glue version default to Glue 0.9.

                      * **DPUSeconds** *(float) --*

                        This field can be set for either job runs with
                        execution class "FLEX" or when Auto Scaling is
                        enabled, and represents the total time each
                        executor ran during the lifecycle of a job run
                        in seconds, multiplied by a DPU factor (1 for
                        "G.1X", 2 for "G.2X", or 0.25 for "G.025X"
                        workers). This value may be different than the
                        "executionEngineRuntime" * "MaxCapacity" as in
                        the case of Auto Scaling jobs, as the number
                        of executors running at a given time may be
                        less than the "MaxCapacity". Therefore, it is
                        possible that the value of "DPUSeconds" is
                        less than "executionEngineRuntime" *
                        "MaxCapacity".

                      * **ExecutionClass** *(string) --*

                        Indicates whether the job is run with a
                        standard or flexible execution class. The
                        standard execution class is ideal for time-
                        sensitive workloads that require fast job
                        startup and dedicated resources.

                        The flexible execution class is appropriate
                        for time-insensitive jobs whose start and
                        completion times may vary.

                        Only jobs with Glue version 3.0 and above and
                        command type "glueetl" will be allowed to set
                        "ExecutionClass" to "FLEX". The flexible
                        execution class is available for Spark jobs.

                      * **MaintenanceWindow** *(string) --*

                        This field specifies a day of the week and
                        hour for a maintenance window for streaming
                        jobs. Glue periodically performs maintenance
                        activities. During these maintenance windows,
                        Glue will need to restart your streaming jobs.

                        Glue will restart the job within 3 hours of
                        the specified maintenance window. For
                        instance, if you set up the maintenance window
                        for Monday at 10:00AM GMT, your jobs will be
                        restarted between 10:00AM GMT and 1:00PM GMT.

                      * **ProfileName** *(string) --*

                        The name of a Glue usage profile associated
                        with the job run.

                      * **StateDetail** *(string) --*

                        This field holds details that pertain to the
                        state of a job run. The field is nullable.

                        For example, when a job run is in a WAITING
                        state as a result of job run queuing, the
                        field has the reason why the job run is in
                        that state.

                      * **ExecutionRoleSessionPolicy** *(string) --*

                        An inline session policy passed to the
                        StartJobRun API that dynamically restricts the
                        permissions of the specified execution role
                        for the scope of the job, without requiring
                        the creation of additional IAM roles.

                * **CrawlerDetails** *(dict) --*

                  Details of the crawler when the node represents a
                  crawler.

                  * **Crawls** *(list) --*

                    A list of crawls represented by the crawl node.

                    * *(dict) --*

                      The details of a crawl in the workflow.

                      * **State** *(string) --*

                        The state of the crawler.

                      * **StartedOn** *(datetime) --*

                        The date and time on which the crawl started.

                      * **CompletedOn** *(datetime) --*

                        The date and time on which the crawl
                        completed.

                      * **ErrorMessage** *(string) --*

                        The error message associated with the crawl.

                      * **LogGroup** *(string) --*

                        The log group associated with the crawl.

                      * **LogStream** *(string) --*

                        The log stream associated with the crawl.

            * **Edges** *(list) --*

              A list of all the directed connections between the nodes
              belonging to the workflow.

              * *(dict) --*

                An edge represents a directed connection between two
                Glue components that are part of the workflow the edge
                belongs to.

                * **SourceId** *(string) --*

                  The unique ID of the node within the workflow where
                  the edge starts.

                * **DestinationId** *(string) --*

                  The unique ID of the node within the workflow where
                  the edge ends.

          * **StartingEventBatchCondition** *(dict) --*

            The batch condition that started the workflow run.

            * **BatchSize** *(integer) --*

              Number of events in the batch.

            * **BatchWindow** *(integer) --*

              Duration of the batch window in seconds.
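
   The "Edges" list described above is a flat list of directed
   connections, which is often easier to work with as an adjacency map.
   The following is a minimal sketch, assuming "graph" is the dict from
   the response that holds the "Edges" list; the function name and the
   sample node IDs are hypothetical and not part of the Glue API.

```python
from collections import defaultdict


def build_adjacency(graph):
    """Map each SourceId to the list of DestinationIds it points to."""
    adjacency = defaultdict(list)
    for edge in graph.get("Edges", []):
        adjacency[edge["SourceId"]].append(edge["DestinationId"])
    return dict(adjacency)


# Hypothetical graph fragment shaped like the structure documented above.
sample_graph = {
    "Edges": [
        {"SourceId": "trigger-1", "DestinationId": "job-1"},
        {"SourceId": "trigger-1", "DestinationId": "crawler-1"},
        {"SourceId": "job-1", "DestinationId": "trigger-2"},
    ]
}

adjacency = build_adjacency(sample_graph)
```

   Nodes that only appear as a "DestinationId" have no key in the
   resulting map, so a lookup with "adjacency.get(node_id, [])" is the
   safer way to read it.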

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
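
   The "DPUSeconds" description in the response structure above amounts
   to a small piece of arithmetic. The sketch below is one reading of
   that description, not an official formula: it sums each executor's
   runtime in seconds and multiplies by the DPU factor quoted in the
   text. The function names and the sample run are hypothetical.

```python
# DPU factors quoted in the DPUSeconds description above.
DPU_FACTOR = {"G.1X": 1.0, "G.2X": 2.0, "G.025X": 0.25}


def dpu_seconds(worker_type, executor_runtimes_seconds):
    """Sum the per-executor runtimes (seconds) and scale by the DPU factor."""
    return DPU_FACTOR[worker_type] * sum(executor_runtimes_seconds)


def max_capacity_bound(execution_engine_runtime_seconds, max_capacity):
    """The executionEngineRuntime * MaxCapacity figure the text compares against."""
    return execution_engine_runtime_seconds * max_capacity


# Hypothetical Auto Scaling run: three G.2X executors, of which only one
# ran for the full 600 seconds. MaxCapacity is assumed to be 6 DPUs.
runtimes = [600, 300, 120]
used = dpu_seconds("G.2X", runtimes)
bound = max_capacity_bound(600, 6)
```

   In this made-up run "used" comes out below "bound", which matches
   the note that "DPUSeconds" can be less than
   "executionEngineRuntime" * "MaxCapacity" when fewer executors are
   running than the maximum allows.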


get_connections
***************

Glue.Client.get_connections(**kwargs)

   Retrieves a list of connection definitions from the Data Catalog.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_connections(
          CatalogId='string',
          Filter={
              'MatchCriteria': [
                  'string',
              ],
              'ConnectionType': 'JDBC'|'SFTP'|'MONGODB'|'KAFKA'|'NETWORK'|'MARKETPLACE'|'CUSTOM'|'SALESFORCE'|'VIEW_VALIDATION_REDSHIFT'|'VIEW_VALIDATION_ATHENA'|'GOOGLEADS'|'GOOGLESHEETS'|'GOOGLEANALYTICS4'|'SERVICENOW'|'MARKETO'|'SAPODATA'|'ZENDESK'|'JIRACLOUD'|'NETSUITEERP'|'HUBSPOT'|'FACEBOOKADS'|'INSTAGRAMADS'|'ZOHOCRM'|'SALESFORCEPARDOT'|'SALESFORCEMARKETINGCLOUD'|'SLACK'|'STRIPE'|'INTERCOM'|'SNAPCHATADS',
              'ConnectionSchemaVersion': 123
          },
          HidePassword=True|False,
          NextToken='string',
          MaxResults=123
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog in
        which the connections reside. If none is provided, the Amazon
        Web Services account ID is used by default.

      * **Filter** (*dict*) --

        A filter that controls which connections are returned.

        * **MatchCriteria** *(list) --*

          A criteria string that must match the criteria recorded in
          the connection definition for that connection definition to
          be returned.

          * *(string) --*

        * **ConnectionType** *(string) --*

          The type of connections to return. Currently, SFTP is not
          supported.

        * **ConnectionSchemaVersion** *(integer) --*

          Denotes if the connection was created with schema version 1
          or 2.

      * **HidePassword** (*boolean*) -- Allows you to retrieve the
        connection metadata without returning the password. For
        instance, the Glue console uses this flag to retrieve the
        connection, and does not display the password. Set this
        parameter when the caller might not have permission to use the
        KMS key to decrypt the password, but it does have permission
        to access the rest of the connection properties.

      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation call.

      * **MaxResults** (*integer*) -- The maximum number of
        connections to return in one response.
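
      The "NextToken" and "MaxResults" parameters above imply a paging
      loop. The following is a minimal sketch of that loop, assuming
      the caller passes a client such as "boto3.client('glue')".
      "iter_connections" and "FakeGlueClient" are hypothetical names;
      the fake client only exists so the sketch runs without AWS
      credentials.

```python
def iter_connections(client, connection_type=None, page_size=50):
    """Yield every connection definition, following NextToken across pages."""
    kwargs = {"HidePassword": True, "MaxResults": page_size}
    if connection_type is not None:
        kwargs["Filter"] = {"ConnectionType": connection_type}
    while True:
        response = client.get_connections(**kwargs)
        yield from response.get("ConnectionList", [])
        token = response.get("NextToken")
        if not token:
            break
        # Pass the token back unchanged to request the next page.
        kwargs["NextToken"] = token


class FakeGlueClient:
    """Stand-in for boto3.client('glue') that serves two canned pages."""

    def __init__(self):
        self.calls = []

    def get_connections(self, **kwargs):
        self.calls.append(dict(kwargs))
        if kwargs.get("NextToken") is None:
            return {
                "ConnectionList": [{"Name": "a"}, {"Name": "b"}],
                "NextToken": "page-2",
            }
        return {"ConnectionList": [{"Name": "c"}]}


fake = FakeGlueClient()
names = [conn["Name"] for conn in iter_connections(fake, connection_type="JDBC")]
```

      With a real client, the same generator can be consumed lazily, so
      later pages are only requested if the caller keeps iterating.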

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'ConnectionList': [
                 {
                     'Name': 'string',
                     'Description': 'string',
                     'ConnectionType': 'JDBC'|'SFTP'|'MONGODB'|'KAFKA'|'NETWORK'|'MARKETPLACE'|'CUSTOM'|'SALESFORCE'|'VIEW_VALIDATION_REDSHIFT'|'VIEW_VALIDATION_ATHENA'|'GOOGLEADS'|'GOOGLESHEETS'|'GOOGLEANALYTICS4'|'SERVICENOW'|'MARKETO'|'SAPODATA'|'ZENDESK'|'JIRACLOUD'|'NETSUITEERP'|'HUBSPOT'|'FACEBOOKADS'|'INSTAGRAMADS'|'ZOHOCRM'|'SALESFORCEPARDOT'|'SALESFORCEMARKETINGCLOUD'|'SLACK'|'STRIPE'|'INTERCOM'|'SNAPCHATADS',
                     'MatchCriteria': [
                         'string',
                     ],
                     'ConnectionProperties': {
                         'string': 'string'
                     },
                     'SparkProperties': {
                         'string': 'string'
                     },
                     'AthenaProperties': {
                         'string': 'string'
                     },
                     'PythonProperties': {
                         'string': 'string'
                     },
                     'PhysicalConnectionRequirements': {
                         'SubnetId': 'string',
                         'SecurityGroupIdList': [
                             'string',
                         ],
                         'AvailabilityZone': 'string'
                     },
                     'CreationTime': datetime(2015, 1, 1),
                     'LastUpdatedTime': datetime(2015, 1, 1),
                     'LastUpdatedBy': 'string',
                     'Status': 'READY'|'IN_PROGRESS'|'FAILED',
                     'StatusReason': 'string',
                     'LastConnectionValidationTime': datetime(2015, 1, 1),
                     'AuthenticationConfiguration': {
                         'AuthenticationType': 'BASIC'|'OAUTH2'|'CUSTOM'|'IAM',
                         'SecretArn': 'string',
                         'OAuth2Properties': {
                             'OAuth2GrantType': 'AUTHORIZATION_CODE'|'CLIENT_CREDENTIALS'|'JWT_BEARER',
                             'OAuth2ClientApplication': {
                                 'UserManagedClientApplicationClientId': 'string',
                                 'AWSManagedClientApplicationReference': 'string'
                             },
                             'TokenUrl': 'string',
                             'TokenUrlParametersMap': {
                                 'string': 'string'
                             }
                         }
                     },
                     'ConnectionSchemaVersion': 123,
                     'CompatibleComputeEnvironments': [
                         'SPARK'|'ATHENA'|'PYTHON',
                     ]
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **ConnectionList** *(list) --*

          A list of requested connection definitions.

          * *(dict) --*

            Defines a connection to a data source.

            * **Name** *(string) --*

              The name of the connection definition.

            * **Description** *(string) --*

              The description of the connection.

            * **ConnectionType** *(string) --*

              The type of the connection. Currently, SFTP is not
              supported.

            * **MatchCriteria** *(list) --*

              A list of criteria that can be used in selecting this
              connection.

              * *(string) --*

            * **ConnectionProperties** *(dict) --*

              These key-value pairs define parameters for the
              connection when using the version 1 Connection schema:

              * "HOST" - The host URI: either the fully qualified
                domain name (FQDN) or the IPv4 address of the database
                host.

              * "PORT" - The port number, between 1024 and 65535, of
                the port on which the database host is listening for
                database connections.

              * "USER_NAME" - The name under which to log in to the
                database. The value string for "USER_NAME" is
                "USERNAME".

              * "PASSWORD" - A password, if one is used, for the user
                name.

              * "ENCRYPTED_PASSWORD" - When you enable connection
                password protection by setting
                "ConnectionPasswordEncryption" in the Data Catalog
                encryption settings, this field stores the encrypted
                password.

              * "JDBC_DRIVER_JAR_URI" - The Amazon Simple Storage
                Service (Amazon S3) path of the JAR file that contains
                the JDBC driver to use.

              * "JDBC_DRIVER_CLASS_NAME" - The class name of the JDBC
                driver to use.

              * "JDBC_ENGINE" - The name of the JDBC engine to use.

              * "JDBC_ENGINE_VERSION" - The version of the JDBC engine
                to use.

              * "CONFIG_FILES" - (Reserved for future use.)

              * "INSTANCE_ID" - The instance ID to use.

              * "JDBC_CONNECTION_URL" - The URL for connecting to a
                JDBC data source.

              * "JDBC_ENFORCE_SSL" - A Boolean string (true, false)
                specifying whether Secure Sockets Layer (SSL) with
                hostname matching is enforced for the JDBC connection
                on the client. The default is false.

              * "CUSTOM_JDBC_CERT" - An Amazon S3 location specifying
                the customer's root certificate. Glue uses this root
                certificate to validate the customer’s certificate
                when connecting to the customer database. Glue only
                handles X.509 certificates. The certificate provided
                must be DER-encoded and supplied in Base64-encoded
                PEM format.

              * "SKIP_CUSTOM_JDBC_CERT_VALIDATION" - By default, this
                is "false". Glue validates the Signature algorithm and
                Subject Public Key Algorithm for the customer
                certificate. The only permitted algorithms for the
                Signature algorithm are SHA256withRSA, SHA384withRSA
                or SHA512withRSA. For the Subject Public Key
                Algorithm, the key length must be at least 2048. You
                can set the value of this property to "true" to skip
                Glue’s validation of the customer certificate.

              * "CUSTOM_JDBC_CERT_STRING" - A custom JDBC certificate
                string which is used for domain match or distinguished
                name match to prevent a man-in-the-middle attack. In
                Oracle database, this is used as the
                "SSL_SERVER_CERT_DN"; in Microsoft SQL Server, this is
                used as the "hostNameInCertificate".

              * "CONNECTION_URL" - The URL for connecting to a general
                (non-JDBC) data source.

              * "SECRET_ID" - The secret ID used for the secret
                manager of credentials.

              * "CONNECTOR_URL" - The connector URL for a MARKETPLACE
                or CUSTOM connection.

              * "CONNECTOR_TYPE" - The connector type for a
                MARKETPLACE or CUSTOM connection.

              * "CONNECTOR_CLASS_NAME" - The connector class name for
                a MARKETPLACE or CUSTOM connection.

              * "KAFKA_BOOTSTRAP_SERVERS" - A comma-separated list of
                host and port pairs that are the addresses of the
                Apache Kafka brokers in a Kafka cluster to which a
                Kafka client will connect and bootstrap itself.

              * "KAFKA_SSL_ENABLED" - Whether to enable or disable SSL
                on an Apache Kafka connection. Default value is
                "true".

              * "KAFKA_CUSTOM_CERT" - The Amazon S3 URL for the
                private CA cert file (.pem format). The default is an
                empty string.

              * "KAFKA_SKIP_CUSTOM_CERT_VALIDATION" - Whether to skip
                the validation of the CA cert file or not. Glue
                validates for three algorithms: SHA256withRSA,
                SHA384withRSA and SHA512withRSA. Default value is
                "false".

              * "KAFKA_CLIENT_KEYSTORE" - The Amazon S3 location of
                the client keystore file for Kafka client side
                authentication (Optional).

              * "KAFKA_CLIENT_KEYSTORE_PASSWORD" - The password to
                access the provided keystore (Optional).

              * "KAFKA_CLIENT_KEY_PASSWORD" - A keystore can consist
                of multiple keys, so this is the password to access
                the client key to be used with the Kafka server side
                key (Optional).

              * "ENCRYPTED_KAFKA_CLIENT_KEYSTORE_PASSWORD" - The
                encrypted version of the Kafka client keystore
                password (if the user has the Glue encrypt passwords
                setting selected).

              * "ENCRYPTED_KAFKA_CLIENT_KEY_PASSWORD" - The encrypted
                version of the Kafka client key password (if the user
                has the Glue encrypt passwords setting selected).

              * "KAFKA_SASL_MECHANISM" - "SCRAM-SHA-512", "GSSAPI",
                "AWS_MSK_IAM", or "PLAIN". These are the supported
                SASL Mechanisms.

              * "KAFKA_SASL_PLAIN_USERNAME" - A plaintext username
                used to authenticate with the "PLAIN" mechanism.

              * "KAFKA_SASL_PLAIN_PASSWORD" - A plaintext password
                used to authenticate with the "PLAIN" mechanism.

              * "ENCRYPTED_KAFKA_SASL_PLAIN_PASSWORD" - The encrypted
                version of the Kafka SASL PLAIN password (if the user
                has the Glue encrypt passwords setting selected).

              * "KAFKA_SASL_SCRAM_USERNAME" - A plaintext username
                used to authenticate with the "SCRAM-SHA-512"
                mechanism.

              * "KAFKA_SASL_SCRAM_PASSWORD" - A plaintext password
                used to authenticate with the "SCRAM-SHA-512"
                mechanism.

              * "ENCRYPTED_KAFKA_SASL_SCRAM_PASSWORD" - The encrypted
                version of the Kafka SASL SCRAM password (if the user
                has the Glue encrypt passwords setting selected).

              * "KAFKA_SASL_SCRAM_SECRETS_ARN" - The Amazon Resource
                Name of a secret in Amazon Web Services Secrets
                Manager.

              * "KAFKA_SASL_GSSAPI_KEYTAB" - The S3 location of a
                Kerberos "keytab" file. A keytab stores long-term keys
                for one or more principals. For more information, see
                MIT Kerberos Documentation: Keytab.

              * "KAFKA_SASL_GSSAPI_KRB5_CONF" - The S3 location of a
                Kerberos "krb5.conf" file. A krb5.conf stores Kerberos
                configuration information, such as the location of the
                KDC server. For more information, see MIT Kerberos
                Documentation: krb5.conf.

              * "KAFKA_SASL_GSSAPI_SERVICE" - The Kerberos service
                name, as set with "sasl.kerberos.service.name" in your
                Kafka Configuration.

              * "KAFKA_SASL_GSSAPI_PRINCIPAL" - The name of the
                Kerberos principal used by Glue. For more information,
                see Kafka Documentation: Configuring Kafka Brokers.

              * "ROLE_ARN" - The role to be used for running queries.

              * "REGION" - The Amazon Web Services Region where
                queries will be run.

              * "WORKGROUP_NAME" - The name of an Amazon Redshift
                serverless workgroup or Amazon Athena workgroup in
                which queries will run.

              * "CLUSTER_IDENTIFIER" - The cluster identifier of an
                Amazon Redshift cluster in which queries will run.

              * "DATABASE" - The Amazon Redshift database that you are
                connecting to.

              * *(string) --*

                * *(string) --*

            * **SparkProperties** *(dict) --*

              Connection properties specific to the Spark compute
              environment.

              * *(string) --*

                * *(string) --*

            * **AthenaProperties** *(dict) --*

              Connection properties specific to the Athena compute
              environment.

              * *(string) --*

                * *(string) --*

            * **PythonProperties** *(dict) --*

              Connection properties specific to the Python compute
              environment.

              * *(string) --*

                * *(string) --*

            * **PhysicalConnectionRequirements** *(dict) --*

              The physical connection requirements, such as virtual
              private cloud (VPC) and "SecurityGroup", that are needed
              to make this connection successfully.

              * **SubnetId** *(string) --*

                The subnet ID used by the connection.

              * **SecurityGroupIdList** *(list) --*

                The security group ID list used by the connection.

                * *(string) --*

              * **AvailabilityZone** *(string) --*

                The connection's Availability Zone.

            * **CreationTime** *(datetime) --*

              The timestamp of the time that this connection
              definition was created.

            * **LastUpdatedTime** *(datetime) --*

              The timestamp of the last time the connection definition
              was updated.

            * **LastUpdatedBy** *(string) --*

              The user, group, or role that last updated this
              connection definition.

            * **Status** *(string) --*

              The status of the connection. Can be one of: "READY",
              "IN_PROGRESS", or "FAILED".

            * **StatusReason** *(string) --*

              The reason for the connection status.

            * **LastConnectionValidationTime** *(datetime) --*

              A timestamp of the time this connection was last
              validated.

            * **AuthenticationConfiguration** *(dict) --*

              The authentication properties of the connection.

              * **AuthenticationType** *(string) --*

                A structure containing the authentication
                configuration.

              * **SecretArn** *(string) --*

                The secret manager ARN to store credentials.

              * **OAuth2Properties** *(dict) --*

                The properties for OAuth2 authentication.

                * **OAuth2GrantType** *(string) --*

                  The OAuth2 grant type. For example,
                  "AUTHORIZATION_CODE", "JWT_BEARER", or
                  "CLIENT_CREDENTIALS".

                * **OAuth2ClientApplication** *(dict) --*

                  The client application type. For example,
                  AWS_MANAGED or USER_MANAGED.

                  * **UserManagedClientApplicationClientId** *(string)
                    --*

                    The client application clientID if the
                    ClientAppType is "USER_MANAGED".

                  * **AWSManagedClientApplicationReference** *(string)
                    --*

                    The reference to the SaaS-side client app that is
                    Amazon Web Services managed.

                * **TokenUrl** *(string) --*

                  The URL of the provider's authentication server, to
                  exchange an authorization code for an access token.

                * **TokenUrlParametersMap** *(dict) --*

                  A map of parameters that are added to the token
                  "GET" request.

                  * *(string) --*

                    * *(string) --*

            * **ConnectionSchemaVersion** *(integer) --*

              The version of the connection schema for this
              connection. Version 2 supports properties for specific
              compute environments.

            * **CompatibleComputeEnvironments** *(list) --*

              A list of compute environments compatible with the
              connection.

              * *(string) --*

        * **NextToken** *(string) --*

          A continuation token, if the list of connections returned
          does not include the last of the filtered connections.
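
      The "Status" and "CompatibleComputeEnvironments" fields described
      above can be combined to pick out usable connections. The sketch
      below works on a plain "ConnectionList" value; the function name
      and the sample entries are hypothetical.

```python
def ready_connections_for(connection_list, environment):
    """Names of READY connections that list `environment` as compatible."""
    return [
        conn["Name"]
        for conn in connection_list
        if conn.get("Status") == "READY"
        and environment in conn.get("CompatibleComputeEnvironments", [])
    ]


# Hypothetical entries using only keys from the response syntax above.
sample_connections = [
    {"Name": "orders-db", "Status": "READY",
     "CompatibleComputeEnvironments": ["SPARK", "ATHENA"]},
    {"Name": "events-kafka", "Status": "FAILED",
     "CompatibleComputeEnvironments": ["SPARK"]},
    {"Name": "legacy-jdbc", "Status": "READY"},
]

spark_ready = ready_connections_for(sample_connections, "SPARK")
```

      Using ".get" with a default keeps the filter from failing on
      entries that omit either field, as the third sample entry does.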

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.GlueEncryptionException"


get_classifier
**************

Glue.Client.get_classifier(**kwargs)

   Retrieve a classifier by name.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_classifier(
          Name='string'
      )

   Parameters:
      **Name** (*string*) --

      **[REQUIRED]**

      Name of the classifier to retrieve.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Classifier': {
                 'GrokClassifier': {
                     'Name': 'string',
                     'Classification': 'string',
                     'CreationTime': datetime(2015, 1, 1),
                     'LastUpdated': datetime(2015, 1, 1),
                     'Version': 123,
                     'GrokPattern': 'string',
                     'CustomPatterns': 'string'
                 },
                 'XMLClassifier': {
                     'Name': 'string',
                     'Classification': 'string',
                     'CreationTime': datetime(2015, 1, 1),
                     'LastUpdated': datetime(2015, 1, 1),
                     'Version': 123,
                     'RowTag': 'string'
                 },
                 'JsonClassifier': {
                     'Name': 'string',
                     'CreationTime': datetime(2015, 1, 1),
                     'LastUpdated': datetime(2015, 1, 1),
                     'Version': 123,
                     'JsonPath': 'string'
                 },
                 'CsvClassifier': {
                     'Name': 'string',
                     'CreationTime': datetime(2015, 1, 1),
                     'LastUpdated': datetime(2015, 1, 1),
                     'Version': 123,
                     'Delimiter': 'string',
                     'QuoteSymbol': 'string',
                     'ContainsHeader': 'UNKNOWN'|'PRESENT'|'ABSENT',
                     'Header': [
                         'string',
                     ],
                     'DisableValueTrimming': True|False,
                     'AllowSingleColumn': True|False,
                     'CustomDatatypeConfigured': True|False,
                     'CustomDatatypes': [
                         'string',
                     ],
                     'Serde': 'OpenCSVSerDe'|'LazySimpleSerDe'|'None'
                 }
             }
         }

      **Response Structure**

      * *(dict) --*

        * **Classifier** *(dict) --*

          The requested classifier.

          * **GrokClassifier** *(dict) --*

            A classifier that uses "grok".

            * **Name** *(string) --*

              The name of the classifier.

            * **Classification** *(string) --*

              An identifier of the data format that the classifier
              matches, such as Twitter, JSON, Omniture logs, and so
              on.

            * **CreationTime** *(datetime) --*

              The time that this classifier was registered.

            * **LastUpdated** *(datetime) --*

              The time that this classifier was last updated.

            * **Version** *(integer) --*

              The version of this classifier.

            * **GrokPattern** *(string) --*

              The grok pattern applied to a data store by this
              classifier. For more information, see built-in patterns
              in Writing Custom Classifiers.

            * **CustomPatterns** *(string) --*

              Optional custom grok patterns defined by this
              classifier. For more information, see custom patterns in
              Writing Custom Classifiers.

          * **XMLClassifier** *(dict) --*

            A classifier for XML content.

            * **Name** *(string) --*

              The name of the classifier.

            * **Classification** *(string) --*

              An identifier of the data format that the classifier
              matches.

            * **CreationTime** *(datetime) --*

              The time that this classifier was registered.

            * **LastUpdated** *(datetime) --*

              The time that this classifier was last updated.

            * **Version** *(integer) --*

              The version of this classifier.

            * **RowTag** *(string) --*

              The XML tag designating the element that contains each
              record in an XML document being parsed. This can't
              identify a self-closing element (closed by "/>"). An
              empty row element that contains only attributes can be
              parsed as long as it ends with a closing tag (for
              example, "<row item_a="A" item_b="B"></row>" is okay,
              but "<row item_a="A" item_b="B" />" is not).

          * **JsonClassifier** *(dict) --*

            A classifier for JSON content.

            * **Name** *(string) --*

              The name of the classifier.

            * **CreationTime** *(datetime) --*

              The time that this classifier was registered.

            * **LastUpdated** *(datetime) --*

              The time that this classifier was last updated.

            * **Version** *(integer) --*

              The version of this classifier.

            * **JsonPath** *(string) --*

              A "JsonPath" string defining the JSON data for the
              classifier to classify. Glue supports a subset of
              JsonPath, as described in Writing JsonPath Custom
              Classifiers.

          * **CsvClassifier** *(dict) --*

            A classifier for comma-separated values (CSV).

            * **Name** *(string) --*

              The name of the classifier.

            * **CreationTime** *(datetime) --*

              The time that this classifier was registered.

            * **LastUpdated** *(datetime) --*

              The time that this classifier was last updated.

            * **Version** *(integer) --*

              The version of this classifier.

            * **Delimiter** *(string) --*

              A custom symbol to denote what separates each column
              entry in the row.

            * **QuoteSymbol** *(string) --*

              A custom symbol to denote what combines content into a
              single column value. It must be different from the
              column delimiter.

            * **ContainsHeader** *(string) --*

              Indicates whether the CSV file contains a header.

            * **Header** *(list) --*

              A list of strings representing column names.

              * *(string) --*

            * **DisableValueTrimming** *(boolean) --*

              Specifies not to trim values before identifying the type
              of column values. The default value is "true".

            * **AllowSingleColumn** *(boolean) --*

              Enables the processing of files that contain only one
              column.

            * **CustomDatatypeConfigured** *(boolean) --*

              Enables the custom datatype to be configured.

            * **CustomDatatypes** *(list) --*

              A list of custom datatypes including "BINARY",
              "BOOLEAN", "DATE", "DECIMAL", "DOUBLE", "FLOAT", "INT",
              "LONG", "SHORT", "STRING", "TIMESTAMP".

              * *(string) --*

            * **Serde** *(string) --*

              Sets the SerDe for processing CSV in the classifier,
              which will be applied in the Data Catalog. Valid values
              are "OpenCSVSerDe", "LazySimpleSerDe", and "None". You
              can specify the "None" value when you want the crawler
              to do the detection.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.OperationTimeoutException"
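
   A minimal sketch of inspecting this response. It assumes that only
   one of the four classifier keys is populated for a given
   classifier, which the Response Syntax suggests but does not state
   outright. The helper "classifier_kind" and the sample dict are
   hypothetical and are not part of boto3.

```python
# Keys of the 'Classifier' dict, as listed in the Response Syntax above.
CLASSIFIER_KINDS = ("GrokClassifier", "XMLClassifier", "JsonClassifier", "CsvClassifier")


def classifier_kind(response):
    """Return (kind, details) for the first populated classifier entry.

    Returns (None, None) if the response carries none of the known keys.
    """
    classifier = response.get("Classifier", {})
    for kind in CLASSIFIER_KINDS:
        details = classifier.get(kind)
        if details:
            return kind, details
    return None, None


# Hand-written sample shaped like the Response Syntax (not real service output).
sample = {
    "Classifier": {
        "CsvClassifier": {
            "Name": "my-csv-classifier",  # hypothetical name
            "Delimiter": ",",
            "ContainsHeader": "PRESENT",
            "Serde": "OpenCSVSerDe",
        }
    }
}

kind, details = classifier_kind(sample)
print(kind, details["Name"])
```

   With a real client, the same helper would be applied to the result
   of "client.get_classifier(Name=...)".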


get_unfiltered_partitions_metadata
**********************************

Glue.Client.get_unfiltered_partitions_metadata(**kwargs)

   Retrieves partition metadata from the Data Catalog that contains
   unfiltered metadata.

   For IAM authorization, the public IAM action associated with this
   API is "glue:GetPartitions".

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_unfiltered_partitions_metadata(
          Region='string',
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          Expression='string',
          AuditContext={
              'AdditionalAuditContext': 'string',
              'RequestedColumns': [
                  'string',
              ],
              'AllColumnsRequested': True|False
          },
          SupportedPermissionTypes=[
              'COLUMN_PERMISSION'|'CELL_FILTER_PERMISSION'|'NESTED_PERMISSION'|'NESTED_CELL_PERMISSION',
          ],
          NextToken='string',
          Segment={
              'SegmentNumber': 123,
              'TotalSegments': 123
          },
          MaxResults=123,
          QuerySessionContext={
              'QueryId': 'string',
              'QueryStartTime': datetime(2015, 1, 1),
              'ClusterId': 'string',
              'QueryAuthorizationId': 'string',
              'AdditionalContext': {
                  'string': 'string'
              }
          }
      )

   Parameters:
      * **Region** (*string*) -- Specified only if the base tables
        belong to a different Amazon Web Services Region.

      * **CatalogId** (*string*) --

        **[REQUIRED]**

        The ID of the Data Catalog where the partitions in question
        reside. If none is provided, the AWS account ID is used by
        default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the catalog database where the partitions reside.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table that contains the partition.

      * **Expression** (*string*) --

        An expression that filters the partitions to be returned.

        The expression uses SQL syntax similar to the SQL "WHERE"
        filter clause. The SQL statement parser JSQLParser parses the
        expression.

        *Operators*: The following are the operators that you can use
        in the "Expression" API call:

           =

        Checks whether the values of the two operands are equal; if
        yes, then the condition becomes true.

        Example: Assume 'variable a' holds 10 and 'variable b' holds
        20.

        (a = b) is not true.

           < >

        Checks whether the values of two operands are equal; if the
        values are not equal, then the condition becomes true.

        Example: (a < > b) is true.

           >

        Checks whether the value of the left operand is greater than
        the value of the right operand; if yes, then the condition
        becomes true.

        Example: (a > b) is not true.

           <

        Checks whether the value of the left operand is less than the
        value of the right operand; if yes, then the condition becomes
        true.

        Example: (a < b) is true.

           >=

        Checks whether the value of the left operand is greater than
        or equal to the value of the right operand; if yes, then the
        condition becomes true.

        Example: (a >= b) is not true.

           <=

        Checks whether the value of the left operand is less than or
        equal to the value of the right operand; if yes, then the
        condition becomes true.

        Example: (a <= b) is true.

           AND, OR, IN, BETWEEN, LIKE, NOT, IS NULL

        Logical operators.

        *Supported Partition Key Types*: The following are the
        supported partition keys.

        * "string"

        * "date"

        * "timestamp"

        * "int"

        * "bigint"

        * "long"

        * "tinyint"

        * "smallint"

        * "decimal"

        If a type is encountered that is not valid, an exception is
        thrown.

      * **AuditContext** (*dict*) --

        A structure containing Lake Formation audit context
        information.

        * **AdditionalAuditContext** *(string) --*

          A string containing the additional audit context
          information.

        * **RequestedColumns** *(list) --*

          The requested columns for audit.

          * *(string) --*

        * **AllColumnsRequested** *(boolean) --*

          Indicates whether all columns are requested for audit.

      * **SupportedPermissionTypes** (*list*) --

        **[REQUIRED]**

        A list of supported permission types.

        * *(string) --*

      * **NextToken** (*string*) -- A continuation token, if this is
        not the first call to retrieve these partitions.

      * **Segment** (*dict*) --

        The segment of the table's partitions to scan in this request.

        * **SegmentNumber** *(integer) --* **[REQUIRED]**

          The zero-based index number of the segment. For example, if
          the total number of segments is 4, "SegmentNumber" values
          range from 0 through 3.

        * **TotalSegments** *(integer) --* **[REQUIRED]**

          The total number of segments.

      * **MaxResults** (*integer*) -- The maximum number of partitions
        to return in a single response.

      * **QuerySessionContext** (*dict*) --

        A structure used as a protocol between query engines and Lake
        Formation or Glue. Contains both a Lake Formation generated
        authorization identifier and information from the request's
        authorization context.

        * **QueryId** *(string) --*

          A unique identifier generated by the query engine for the
          query.

        * **QueryStartTime** *(datetime) --*

          A timestamp provided by the query engine for when the query
          started.

        * **ClusterId** *(string) --*

          An identifier string for the consumer cluster.

        * **QueryAuthorizationId** *(string) --*

          A cryptographically generated query identifier generated by
          Glue or Lake Formation.

        * **AdditionalContext** *(dict) --*

          An opaque string-string map passed by the query engine.

          * *(string) --*

            * *(string) --*
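
   The "Expression" and "Segment" parameters described above can be
   combined to split one filtered scan into several requests. The
   following is a sketch only: "segment_requests" is a hypothetical
   helper, and the account ID, database, table, and the partition keys
   "year" and "region" are made-up values. It builds the keyword
   arguments without contacting the service.

```python
def segment_requests(catalog_id, database, table, expression, total_segments,
                     permission_types=("COLUMN_PERMISSION",)):
    """Build one keyword-argument dict per segment of a parallel scan."""
    if total_segments < 1:
        raise ValueError("total_segments must be at least 1")
    requests = []
    for number in range(total_segments):  # SegmentNumber is zero-based
        requests.append({
            "CatalogId": catalog_id,
            "DatabaseName": database,
            "TableName": table,
            "Expression": expression,
            "SupportedPermissionTypes": list(permission_types),
            "Segment": {"SegmentNumber": number, "TotalSegments": total_segments},
        })
    return requests


# Hypothetical identifiers and partition keys ("year" and "region").
requests = segment_requests(
    catalog_id="123456789012",
    database="sales_db",
    table="orders",
    expression="year >= 2023 AND region IN ('us', 'eu')",
    total_segments=4,
)
for kwargs in requests:
    print(kwargs["Segment"])
    # With a real client: client.get_unfiltered_partitions_metadata(**kwargs)
```

   Each dict can be passed to a separate worker, so that the segments
   are scanned independently.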

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'UnfilteredPartitions': [
                 {
                     'Partition': {
                         'Values': [
                             'string',
                         ],
                         'DatabaseName': 'string',
                         'TableName': 'string',
                         'CreationTime': datetime(2015, 1, 1),
                         'LastAccessTime': datetime(2015, 1, 1),
                         'StorageDescriptor': {
                             'Columns': [
                                 {
                                     'Name': 'string',
                                     'Type': 'string',
                                     'Comment': 'string',
                                     'Parameters': {
                                         'string': 'string'
                                     }
                                 },
                             ],
                             'Location': 'string',
                             'AdditionalLocations': [
                                 'string',
                             ],
                             'InputFormat': 'string',
                             'OutputFormat': 'string',
                             'Compressed': True|False,
                             'NumberOfBuckets': 123,
                             'SerdeInfo': {
                                 'Name': 'string',
                                 'SerializationLibrary': 'string',
                                 'Parameters': {
                                     'string': 'string'
                                 }
                             },
                             'BucketColumns': [
                                 'string',
                             ],
                             'SortColumns': [
                                 {
                                     'Column': 'string',
                                     'SortOrder': 123
                                 },
                             ],
                             'Parameters': {
                                 'string': 'string'
                             },
                             'SkewedInfo': {
                                 'SkewedColumnNames': [
                                     'string',
                                 ],
                                 'SkewedColumnValues': [
                                     'string',
                                 ],
                                 'SkewedColumnValueLocationMaps': {
                                     'string': 'string'
                                 }
                             },
                             'StoredAsSubDirectories': True|False,
                             'SchemaReference': {
                                 'SchemaId': {
                                     'SchemaArn': 'string',
                                     'SchemaName': 'string',
                                     'RegistryName': 'string'
                                 },
                                 'SchemaVersionId': 'string',
                                 'SchemaVersionNumber': 123
                             }
                         },
                         'Parameters': {
                             'string': 'string'
                         },
                         'LastAnalyzedTime': datetime(2015, 1, 1),
                         'CatalogId': 'string'
                     },
                     'AuthorizedColumns': [
                         'string',
                     ],
                     'IsRegisteredWithLakeFormation': True|False
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **UnfilteredPartitions** *(list) --*

          A list of requested partitions.

          * *(dict) --*

            A partition that contains unfiltered metadata.

            * **Partition** *(dict) --*

              The partition object.

              * **Values** *(list) --*

                The values of the partition.

                * *(string) --*

              * **DatabaseName** *(string) --*

                The name of the catalog database in which to create
                the partition.

              * **TableName** *(string) --*

                The name of the database table in which to create the
                partition.

              * **CreationTime** *(datetime) --*

                The time at which the partition was created.

              * **LastAccessTime** *(datetime) --*

                The last time at which the partition was accessed.

              * **StorageDescriptor** *(dict) --*

                Provides information about the physical location where
                the partition is stored.

                * **Columns** *(list) --*

                  A list of the "Columns" in the table.

                  * *(dict) --*

                    A column in a "Table".

                    * **Name** *(string) --*

                      The name of the "Column".

                    * **Type** *(string) --*

                      The data type of the "Column".

                    * **Comment** *(string) --*

                      A free-form text comment.

                    * **Parameters** *(dict) --*

                      These key-value pairs define properties
                      associated with the column.

                      * *(string) --*

                        * *(string) --*

                * **Location** *(string) --*

                  The physical location of the table. By default, this
                  takes the form of the warehouse location, followed
                  by the database location in the warehouse, followed
                  by the table name.

                * **AdditionalLocations** *(list) --*

                  A list of locations that point to the path where a
                  Delta table is located.

                  * *(string) --*

                * **InputFormat** *(string) --*

                  The input format: "SequenceFileInputFormat"
                  (binary), or "TextInputFormat", or a custom format.

                * **OutputFormat** *(string) --*

                  The output format: "SequenceFileOutputFormat"
                  (binary), or "IgnoreKeyTextOutputFormat", or a
                  custom format.

                * **Compressed** *(boolean) --*

                  "True" if the data in the table is compressed, or
                  "False" if not.

                * **NumberOfBuckets** *(integer) --*

                  Must be specified if the table contains any
                  dimension columns.

                * **SerdeInfo** *(dict) --*

                  The serialization/deserialization (SerDe)
                  information.

                  * **Name** *(string) --*

                    Name of the SerDe.

                  * **SerializationLibrary** *(string) --*

                    Usually the class that implements the SerDe. An
                    example is
                    "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe".

                  * **Parameters** *(dict) --*

                    These key-value pairs define initialization
                    parameters for the SerDe.

                    * *(string) --*

                      * *(string) --*

                * **BucketColumns** *(list) --*

                  A list of reducer grouping columns, clustering
                  columns, and bucketing columns in the table.

                  * *(string) --*

                * **SortColumns** *(list) --*

                  A list specifying the sort order of each bucket in
                  the table.

                  * *(dict) --*

                    Specifies the sort order of a sorted column.

                    * **Column** *(string) --*

                      The name of the column.

                    * **SortOrder** *(integer) --*

                      Indicates that the column is sorted in ascending
                      order ("== 1"), or in descending order
                      ("== 0").

                * **Parameters** *(dict) --*

                  The user-supplied properties in key-value form.

                  * *(string) --*

                    * *(string) --*

                * **SkewedInfo** *(dict) --*

                  The information about values that appear frequently
                  in a column (skewed values).

                  * **SkewedColumnNames** *(list) --*

                    A list of names of columns that contain skewed
                    values.

                    * *(string) --*

                  * **SkewedColumnValues** *(list) --*

                    A list of values that appear so frequently as to
                    be considered skewed.

                    * *(string) --*

                  * **SkewedColumnValueLocationMaps** *(dict) --*

                    A mapping of skewed values to the columns that
                    contain them.

                    * *(string) --*

                      * *(string) --*

                * **StoredAsSubDirectories** *(boolean) --*

                  "True" if the table data is stored in
                  subdirectories, or "False" if not.

                * **SchemaReference** *(dict) --*

                  An object that references a schema stored in the
                  Glue Schema Registry.

                  When creating a table, you can pass an empty list of
                  columns for the schema, and instead use a schema
                  reference.

                  * **SchemaId** *(dict) --*

                    A structure that contains schema identity fields.
                    Either this or the "SchemaVersionId" has to be
                    provided.

                    * **SchemaArn** *(string) --*

                      The Amazon Resource Name (ARN) of the schema.
                      One of "SchemaArn" or "SchemaName" has to be
                      provided.

                    * **SchemaName** *(string) --*

                      The name of the schema. One of "SchemaArn" or
                      "SchemaName" has to be provided.

                    * **RegistryName** *(string) --*

                      The name of the schema registry that contains
                      the schema.

                  * **SchemaVersionId** *(string) --*

                    The unique ID assigned to a version of the schema.
                    Either this or the "SchemaId" has to be provided.

                  * **SchemaVersionNumber** *(integer) --*

                    The version number of the schema.

              * **Parameters** *(dict) --*

                These key-value pairs define partition parameters.

                * *(string) --*

                  * *(string) --*

              * **LastAnalyzedTime** *(datetime) --*

                The last time at which column statistics were computed
                for this partition.

              * **CatalogId** *(string) --*

                The ID of the Data Catalog in which the partition
                resides.

            * **AuthorizedColumns** *(list) --*

              The list of columns the user has permissions to access.

              * *(string) --*

            * **IsRegisteredWithLakeFormation** *(boolean) --*

              A Boolean value indicating that the partition location
              is registered with Lake Formation.

        * **NextToken** *(string) --*

          A continuation token, if the returned list of partitions
          does not include the last one.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"

   * "Glue.Client.exceptions.PermissionTypeMismatchException"

   * "Glue.Client.exceptions.FederationSourceException"

   * "Glue.Client.exceptions.FederationSourceRetryableException"
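
   Because the response carries a "NextToken" whenever more partitions
   remain, a caller typically loops until the token is absent. The
   sketch below shows one way to do that. "iter_unfiltered_partitions"
   is a hypothetical helper, and "FakeGlueClient" is a stand-in that
   lets the example run without AWS credentials; its token format is
   invented for the example.

```python
def iter_unfiltered_partitions(client, **kwargs):
    """Yield every entry of 'UnfilteredPartitions', following 'NextToken'."""
    params = dict(kwargs)
    while True:
        response = client.get_unfiltered_partitions_metadata(**params)
        yield from response.get("UnfilteredPartitions", [])
        token = response.get("NextToken")
        if not token:
            break
        params["NextToken"] = token


class FakeGlueClient:
    """Stand-in for boto3.client('glue') so the sketch runs offline."""

    def __init__(self, pages):
        self._pages = pages
        self.calls = []

    def get_unfiltered_partitions_metadata(self, **kwargs):
        self.calls.append(kwargs)
        index = int(kwargs.get("NextToken", 0))  # invented token format
        page = {"UnfilteredPartitions": self._pages[index]}
        if index + 1 < len(self._pages):
            page["NextToken"] = str(index + 1)
        return page


fake = FakeGlueClient([
    [{"Partition": {"Values": ["2023"]}, "AuthorizedColumns": ["id"]}],
    [{"Partition": {"Values": ["2024"]}, "AuthorizedColumns": ["id"]}],
])
values = [
    item["Partition"]["Values"]
    for item in iter_unfiltered_partitions(
        fake,
        CatalogId="123456789012",       # hypothetical account ID
        DatabaseName="sales_db",        # hypothetical database
        TableName="orders",             # hypothetical table
        SupportedPermissionTypes=["COLUMN_PERMISSION"],
    )
]
print(values)
```

   Passing "boto3.client('glue')" in place of the stand-in would issue
   the real requests.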


create_security_configuration
*****************************

Glue.Client.create_security_configuration(**kwargs)

   Creates a new security configuration. A security configuration is a
   set of security properties that can be used by Glue. You can use a
   security configuration to encrypt data at rest. For information
   about using security configurations in Glue, see Encrypting Data
   Written by Crawlers, Jobs, and Development Endpoints.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_security_configuration(
          Name='string',
          EncryptionConfiguration={
              'S3Encryption': [
                  {
                      'S3EncryptionMode': 'DISABLED'|'SSE-KMS'|'SSE-S3',
                      'KmsKeyArn': 'string'
                  },
              ],
              'CloudWatchEncryption': {
                  'CloudWatchEncryptionMode': 'DISABLED'|'SSE-KMS',
                  'KmsKeyArn': 'string'
              },
              'JobBookmarksEncryption': {
                  'JobBookmarksEncryptionMode': 'DISABLED'|'CSE-KMS',
                  'KmsKeyArn': 'string'
              },
              'DataQualityEncryption': {
                  'DataQualityEncryptionMode': 'DISABLED'|'SSE-KMS',
                  'KmsKeyArn': 'string'
              }
          }
      )

   Parameters:
      * **Name** (*string*) --

        **[REQUIRED]**

        The name for the new security configuration.

      * **EncryptionConfiguration** (*dict*) --

        **[REQUIRED]**

        The encryption configuration for the new security
        configuration.

        * **S3Encryption** *(list) --*

          The encryption configuration for Amazon Simple Storage
          Service (Amazon S3) data.

          * *(dict) --*

            Specifies how Amazon Simple Storage Service (Amazon S3)
            data should be encrypted.

            * **S3EncryptionMode** *(string) --*

              The encryption mode to use for Amazon S3 data.

            * **KmsKeyArn** *(string) --*

              The Amazon Resource Name (ARN) of the KMS key to be used
              to encrypt the data.

        * **CloudWatchEncryption** *(dict) --*

          The encryption configuration for Amazon CloudWatch.

          * **CloudWatchEncryptionMode** *(string) --*

            The encryption mode to use for CloudWatch data.

          * **KmsKeyArn** *(string) --*

            The Amazon Resource Name (ARN) of the KMS key to be used
            to encrypt the data.

        * **JobBookmarksEncryption** *(dict) --*

          The encryption configuration for job bookmarks.

          * **JobBookmarksEncryptionMode** *(string) --*

            The encryption mode to use for job bookmarks data.

          * **KmsKeyArn** *(string) --*

            The Amazon Resource Name (ARN) of the KMS key to be used
            to encrypt the data.

        * **DataQualityEncryption** *(dict) --*

          The encryption configuration for Glue Data Quality assets.

          * **DataQualityEncryptionMode** *(string) --*

            The encryption mode to use for encrypting Data Quality
            assets. These assets include data quality rulesets,
            results, statistics, anomaly detection models and
            observations.

            Valid values are "SSE-KMS" for encryption using a
            customer-managed KMS key, or "DISABLED".

          * **KmsKeyArn** *(string) --*

            The Amazon Resource Name (ARN) of the KMS key to be used
            to encrypt the data.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Name': 'string',
             'CreatedTimestamp': datetime(2015, 1, 1)
         }

      **Response Structure**

      * *(dict) --*

        * **Name** *(string) --*

          The name assigned to the new security configuration.

        * **CreatedTimestamp** *(datetime) --*

          The time at which the new security configuration was
          created.

   **Exceptions**

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"
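
   A sketch of assembling the "EncryptionConfiguration" argument, using
   only the mode values listed in the Request Syntax. The helper
   "build_encryption_configuration", the policy of using one key for
   every component, and the key ARN are all assumptions made for
   illustration.

```python
def build_encryption_configuration(kms_key_arn=None):
    """Return an EncryptionConfiguration dict.

    With a KMS key ARN, every component uses its KMS-based mode from the
    Request Syntax; without one, every component is 'DISABLED'.
    """
    if kms_key_arn is None:
        return {
            "S3Encryption": [{"S3EncryptionMode": "DISABLED"}],
            "CloudWatchEncryption": {"CloudWatchEncryptionMode": "DISABLED"},
            "JobBookmarksEncryption": {"JobBookmarksEncryptionMode": "DISABLED"},
            "DataQualityEncryption": {"DataQualityEncryptionMode": "DISABLED"},
        }
    return {
        "S3Encryption": [
            {"S3EncryptionMode": "SSE-KMS", "KmsKeyArn": kms_key_arn},
        ],
        "CloudWatchEncryption": {
            "CloudWatchEncryptionMode": "SSE-KMS",
            "KmsKeyArn": kms_key_arn,
        },
        "JobBookmarksEncryption": {
            "JobBookmarksEncryptionMode": "CSE-KMS",  # client-side mode
            "KmsKeyArn": kms_key_arn,
        },
        "DataQualityEncryption": {
            "DataQualityEncryptionMode": "SSE-KMS",
            "KmsKeyArn": kms_key_arn,
        },
    }


# Hypothetical key ARN, for illustration only.
config = build_encryption_configuration(
    "arn:aws:kms:us-east-1:123456789012:key/EXAMPLE-KEY-ID"
)
print(config["JobBookmarksEncryption"]["JobBookmarksEncryptionMode"])
# With a real client:
# client.create_security_configuration(Name="my-config", EncryptionConfiguration=config)
```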


batch_get_jobs
**************

Glue.Client.batch_get_jobs(**kwargs)

   Returns a list of resource metadata for a given list of job names.
   After calling the "ListJobs" operation, you can call this operation
   to access the data to which you have been granted permissions. This
   operation supports all IAM permissions, including permission
   conditions that use tags.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.batch_get_jobs(
          JobNames=[
              'string',
          ]
      )

   Parameters:
      **JobNames** (*list*) --

      **[REQUIRED]**

      A list of job names, which might be the names returned from the
      "ListJobs" operation.

      * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         # This section is too large to render.
         # Please see the AWS API Documentation linked below.

      AWS API Documentation

      **Response Structure**

         # This section is too large to render.
         # Please see the AWS API Documentation linked below.

      AWS API Documentation

   **Exceptions**

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"
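
   A sketch of calling this operation for a long list of names. The
   response structure is not rendered above, so the keys "Jobs" and
   "JobsNotFound" used here are taken from the AWS API Documentation
   and should be checked against it. The helper "get_jobs_by_name" is
   hypothetical, its "batch_size" default is an illustrative choice
   rather than a documented limit, and "FakeGlueClient" is a stand-in
   so the example runs offline.

```python
def get_jobs_by_name(client, job_names, batch_size=25):
    """Fetch job metadata in batches; return (jobs, names_not_found)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    jobs, not_found = [], []
    names = list(job_names)
    for start in range(0, len(names), batch_size):
        response = client.batch_get_jobs(JobNames=names[start:start + batch_size])
        jobs.extend(response.get("Jobs", []))
        not_found.extend(response.get("JobsNotFound", []))
    return jobs, not_found


class FakeGlueClient:
    """Stand-in for boto3.client('glue') so the sketch runs offline."""

    def __init__(self, existing):
        self._existing = set(existing)
        self.batch_sizes = []

    def batch_get_jobs(self, JobNames):
        self.batch_sizes.append(len(JobNames))
        return {
            "Jobs": [{"Name": n} for n in JobNames if n in self._existing],
            "JobsNotFound": [n for n in JobNames if n not in self._existing],
        }


fake = FakeGlueClient(existing=["etl-a", "etl-b", "etl-c"])  # hypothetical jobs
jobs, missing = get_jobs_by_name(
    fake, ["etl-a", "etl-b", "nope", "etl-c"], batch_size=2
)
print([job["Name"] for job in jobs], missing)
```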


put_workflow_run_properties
***************************

Glue.Client.put_workflow_run_properties(**kwargs)

   Puts the specified workflow run properties for the given workflow
   run. If a property already exists for the specified run, its value
   is overwritten; otherwise, the property is added to the existing
   properties.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.put_workflow_run_properties(
          Name='string',
          RunId='string',
          RunProperties={
              'string': 'string'
          }
      )

   Parameters:
      * **Name** (*string*) --

        **[REQUIRED]**

        Name of the workflow which was run.

      * **RunId** (*string*) --

        **[REQUIRED]**

        The ID of the workflow run for which the run properties should
        be updated.

      * **RunProperties** (*dict*) --

        **[REQUIRED]**

        The properties to put for the specified run.

        Run properties may be logged. Do not pass plaintext secrets as
        properties. Retrieve secrets from a Glue Connection, Amazon
        Web Services Secrets Manager or other secret management
        mechanism if you intend to use them within the workflow run.

        * *(string) --*

          * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.ConcurrentModificationException"
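
   "RunProperties" is a string-to-string map, so non-string values
   need converting before the call. The sketch below does that in a
   hypothetical helper, "put_run_properties". "FakeGlueClient" is an
   offline stand-in that imitates the overwrite-or-add behaviour
   described above; the workflow name and run ID are made up.

```python
def put_run_properties(client, workflow_name, run_id, properties):
    """Send properties for a workflow run, converting keys and values to str."""
    run_properties = {str(key): str(value) for key, value in properties.items()}
    client.put_workflow_run_properties(
        Name=workflow_name, RunId=run_id, RunProperties=run_properties
    )
    return run_properties


class FakeGlueClient:
    """Offline stand-in that overwrites existing keys and adds new ones."""

    def __init__(self):
        self.stored = {}

    def put_workflow_run_properties(self, Name, RunId, RunProperties):
        self.stored.setdefault((Name, RunId), {}).update(RunProperties)
        return {}


fake = FakeGlueClient()
# Hypothetical workflow name and run ID. Do not pass secrets as properties.
put_run_properties(fake, "nightly-etl", "wr_example",
                   {"batch_date": "2024-01-01", "retries": 0})
sent = put_run_properties(fake, "nightly-etl", "wr_example", {"retries": 1})
print(fake.stored[("nightly-etl", "wr_example")])
```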


create_integration_resource_property
************************************

Glue.Client.create_integration_resource_property(**kwargs)

   Sets up the "ResourceProperty" of the Glue connection (for the
   source) or the Glue database (for the target). These properties can
   include the role used to access the connection or database. To set
   both source and target properties, invoke this API twice: once with
   the Glue connection ARN as "ResourceArn" together with
   "SourceProcessingProperties", and once with the Glue database ARN
   as "ResourceArn" together with "TargetProcessingProperties".

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_integration_resource_property(
          ResourceArn='string',
          SourceProcessingProperties={
              'RoleArn': 'string'
          },
          TargetProcessingProperties={
              'RoleArn': 'string',
              'KmsArn': 'string',
              'ConnectionName': 'string',
              'EventBusArn': 'string'
          }
      )

   Parameters:
      * **ResourceArn** (*string*) --

        **[REQUIRED]**

        The connection ARN of the source, or the database ARN of the
        target.

      * **SourceProcessingProperties** (*dict*) --

        The resource properties associated with the integration
        source.

        * **RoleArn** *(string) --*

          The IAM role to access the Glue connection.

      * **TargetProcessingProperties** (*dict*) --

        The resource properties associated with the integration
        target.

        * **RoleArn** *(string) --*

          The IAM role to access the Glue database.

        * **KmsArn** *(string) --*

          The ARN of the KMS key used for encryption.

        * **ConnectionName** *(string) --*

          The Glue network connection to configure the Glue job
          running in the customer VPC.

        * **EventBusArn** *(string) --*

          The ARN of an EventBridge event bus to receive the
          integration status notification.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'ResourceArn': 'string',
             'SourceProcessingProperties': {
                 'RoleArn': 'string'
             },
             'TargetProcessingProperties': {
                 'RoleArn': 'string',
                 'KmsArn': 'string',
                 'ConnectionName': 'string',
                 'EventBusArn': 'string'
             }
         }

      **Response Structure**

      * *(dict) --*

        * **ResourceArn** *(string) --*

          The connection ARN of the source, or the database ARN of the
          target.

        * **SourceProcessingProperties** *(dict) --*

          The resource properties associated with the integration
          source.

          * **RoleArn** *(string) --*

            The IAM role to access the Glue connection.

        * **TargetProcessingProperties** *(dict) --*

          The resource properties associated with the integration
          target.

          * **RoleArn** *(string) --*

            The IAM role to access the Glue database.

          * **KmsArn** *(string) --*

            The ARN of the KMS key used for encryption.

          * **ConnectionName** *(string) --*

            The Glue network connection to configure the Glue job
            running in the customer VPC.

          * **EventBusArn** *(string) --*

            The ARN of an EventBridge event bus to receive the
            integration status notification.

   **Exceptions**

   * "Glue.Client.exceptions.ValidationException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.ConflictException"

   * "Glue.Client.exceptions.InternalServerException"

   * "Glue.Client.exceptions.ResourceNotFoundException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.InvalidInputException"
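
   Setting both sides takes two calls, one per resource ARN. The
   following is a minimal sketch of that pattern; the helper name
   "set_integration_properties" and its argument names are
   hypothetical, and "client" is assumed to be a Glue client such as
   "boto3.client('glue')". Only "RoleArn" is set here; the other
   target keys are optional.

```python
def set_integration_properties(client, connection_arn, source_role_arn,
                               database_arn, target_role_arn):
    """Set source and target resource properties with two separate calls."""
    # Source side: the Glue connection ARN plus SourceProcessingProperties.
    source = client.create_integration_resource_property(
        ResourceArn=connection_arn,
        SourceProcessingProperties={"RoleArn": source_role_arn},
    )
    # Target side: the Glue database ARN plus TargetProcessingProperties.
    target = client.create_integration_resource_property(
        ResourceArn=database_arn,
        TargetProcessingProperties={"RoleArn": target_role_arn},
    )
    return source, target
```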


get_schema_versions_diff
************************

Glue.Client.get_schema_versions_diff(**kwargs)

   Fetches the difference, in the specified difference type, between
   two stored schema versions in the Schema Registry.

   This API allows you to compare the definitions of two versions of
   the same schema.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_schema_versions_diff(
          SchemaId={
              'SchemaArn': 'string',
              'SchemaName': 'string',
              'RegistryName': 'string'
          },
          FirstSchemaVersionNumber={
              'LatestVersion': True|False,
              'VersionNumber': 123
          },
          SecondSchemaVersionNumber={
              'LatestVersion': True|False,
              'VersionNumber': 123
          },
          SchemaDiffType='SYNTAX_DIFF'
      )

   Parameters:
      * **SchemaId** (*dict*) --

        **[REQUIRED]**

        A wrapper structure that contains the schema identity fields:

        * "SchemaArn": The Amazon Resource Name (ARN) of the schema.
          One of "SchemaArn" or "SchemaName" has to be provided.

        * "SchemaName": The name of the schema. One of "SchemaArn" or
          "SchemaName" has to be provided.

        * **SchemaArn** *(string) --*

          The Amazon Resource Name (ARN) of the schema. One of
          "SchemaArn" or "SchemaName" has to be provided.

        * **SchemaName** *(string) --*

          The name of the schema. One of "SchemaArn" or "SchemaName"
          has to be provided.

        * **RegistryName** *(string) --*

          The name of the schema registry that contains the schema.

      * **FirstSchemaVersionNumber** (*dict*) --

        **[REQUIRED]**

        The first of the two schema versions to be compared.

        * **LatestVersion** *(boolean) --*

          The latest version available for the schema.

        * **VersionNumber** *(integer) --*

          The version number of the schema.

      * **SecondSchemaVersionNumber** (*dict*) --

        **[REQUIRED]**

        The second of the two schema versions to be compared.

        * **LatestVersion** *(boolean) --*

          The latest version available for the schema.

        * **VersionNumber** *(integer) --*

          The version number of the schema.

      * **SchemaDiffType** (*string*) --

        **[REQUIRED]**

        Refers to "SYNTAX_DIFF", which is the currently supported diff
        type.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Diff': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Diff** *(string) --*

          The difference between schemas as a string in JsonPatch
          format.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.InternalServiceException"
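
   As an illustration, the sketch below compares one numbered version
   with the latest version of a schema identified by name. The helper
   name "diff_against_latest" is hypothetical, and "client" is assumed
   to be a Glue client such as "boto3.client('glue')". It returns the
   "Diff" string unparsed.

```python
def diff_against_latest(client, registry_name, schema_name, version_number):
    """Return the syntax diff between one version and the latest version."""
    response = client.get_schema_versions_diff(
        # Identify the schema by name; an ARN could be used instead.
        SchemaId={"SchemaName": schema_name, "RegistryName": registry_name},
        FirstSchemaVersionNumber={"VersionNumber": version_number},
        SecondSchemaVersionNumber={"LatestVersion": True},
        SchemaDiffType="SYNTAX_DIFF",
    )
    return response["Diff"]
```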


query_schema_version_metadata
*****************************

Glue.Client.query_schema_version_metadata(**kwargs)

   Queries for the schema version metadata information.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.query_schema_version_metadata(
          SchemaId={
              'SchemaArn': 'string',
              'SchemaName': 'string',
              'RegistryName': 'string'
          },
          SchemaVersionNumber={
              'LatestVersion': True|False,
              'VersionNumber': 123
          },
          SchemaVersionId='string',
          MetadataList=[
              {
                  'MetadataKey': 'string',
                  'MetadataValue': 'string'
              },
          ],
          MaxResults=123,
          NextToken='string'
      )

   Parameters:
      * **SchemaId** (*dict*) --

        A wrapper structure that may contain the schema name and
        Amazon Resource Name (ARN).

        * **SchemaArn** *(string) --*

          The Amazon Resource Name (ARN) of the schema. One of
          "SchemaArn" or "SchemaName" has to be provided.

        * **SchemaName** *(string) --*

          The name of the schema. One of "SchemaArn" or "SchemaName"
          has to be provided.

        * **RegistryName** *(string) --*

          The name of the schema registry that contains the schema.

      * **SchemaVersionNumber** (*dict*) --

        The version number of the schema.

        * **LatestVersion** *(boolean) --*

          The latest version available for the schema.

        * **VersionNumber** *(integer) --*

          The version number of the schema.

      * **SchemaVersionId** (*string*) -- The unique version ID of the
        schema version.

      * **MetadataList** (*list*) --

        Search key-value pairs for metadata. If they are not provided,
        all the metadata information will be fetched.

        * *(dict) --*

          A structure containing a key value pair for metadata.

          * **MetadataKey** *(string) --*

            A metadata key.

          * **MetadataValue** *(string) --*

            A metadata key’s corresponding value.

      * **MaxResults** (*integer*) -- The maximum number of results to
        return per page. If no value is supplied, the default is 25
        per page.

      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation call.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'MetadataInfoMap': {
                 'string': {
                     'MetadataValue': 'string',
                     'CreatedTime': 'string',
                     'OtherMetadataValueList': [
                         {
                             'MetadataValue': 'string',
                             'CreatedTime': 'string'
                         },
                     ]
                 }
             },
             'SchemaVersionId': 'string',
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **MetadataInfoMap** *(dict) --*

          A map of a metadata key and associated values.

          * *(string) --*

            * *(dict) --*

              A structure containing metadata information for a schema
              version.

              * **MetadataValue** *(string) --*

                The metadata key’s corresponding value.

              * **CreatedTime** *(string) --*

                The time at which the entry was created.

              * **OtherMetadataValueList** *(list) --*

                Other metadata belonging to the same metadata key.

                * *(dict) --*

                  A structure containing other metadata for a schema
                  version belonging to the same metadata key.

                  * **MetadataValue** *(string) --*

                    The metadata key’s corresponding value for the
                    other metadata belonging to the same metadata key.

                  * **CreatedTime** *(string) --*

                    The time at which the entry was created.

        * **SchemaVersionId** *(string) --*

          The unique version ID of the schema version.

        * **NextToken** *(string) --*

          A continuation token for paginating the returned list of
          tokens, returned if the current segment of the list is not
          the last.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.EntityNotFoundException"
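
   The "NextToken" handling can be sketched as a simple loop. The
   helper name "all_schema_version_metadata" is hypothetical, and
   "client" is assumed to be a Glue client such as
   "boto3.client('glue')". The sketch assumes each metadata key
   appears on one page only; if a key were repeated across pages, the
   later entry would replace the earlier one.

```python
def all_schema_version_metadata(client, schema_version_id, page_size=None):
    """Collect MetadataInfoMap entries from every page into one dict."""
    kwargs = {"SchemaVersionId": schema_version_id}
    if page_size is not None:
        kwargs["MaxResults"] = page_size
    merged = {}
    while True:
        page = client.query_schema_version_metadata(**kwargs)
        merged.update(page.get("MetadataInfoMap", {}))
        token = page.get("NextToken")
        if not token:
            # No token means this was the last page.
            return merged
        kwargs["NextToken"] = token
```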


cancel_statement
****************

Glue.Client.cancel_statement(**kwargs)

   Cancels the statement.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.cancel_statement(
          SessionId='string',
          Id=123,
          RequestOrigin='string'
      )

   Parameters:
      * **SessionId** (*string*) --

        **[REQUIRED]**

        The Session ID of the statement to be cancelled.

      * **Id** (*integer*) --

        **[REQUIRED]**

        The ID of the statement to be cancelled.

      * **RequestOrigin** (*string*) -- The origin of the request to
        cancel the statement.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.IllegalSessionStateException"
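
   One way to use the exception classes listed above is to treat a
   missing statement as "nothing to cancel". This is a minimal sketch
   under that assumption; the helper name
   "cancel_statement_if_present" is hypothetical, and "client" is
   assumed to be a Glue client such as "boto3.client('glue')". All
   other exceptions propagate to the caller.

```python
def cancel_statement_if_present(client, session_id, statement_id):
    """Cancel a statement; return False if it could not be found."""
    try:
        client.cancel_statement(SessionId=session_id, Id=statement_id)
    except client.exceptions.EntityNotFoundException:
        # The session or statement does not exist, so there is nothing to cancel.
        return False
    return True
```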


delete_table_version
********************

Glue.Client.delete_table_version(**kwargs)

   Deletes a specified version of a table.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_table_version(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          VersionId='string'
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog where
        the tables reside. If none is provided, the Amazon Web
        Services account ID is used by default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The database in the catalog in which the table resides. For
        Hive compatibility, this name is entirely lowercase.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table. For Hive compatibility, this name is
        entirely lowercase.

      * **VersionId** (*string*) --

        **[REQUIRED]**

        The ID of the table version to be deleted. A "VersionId" is a
        string representation of an integer. Each version is
        incremented by 1.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
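
   Because "VersionId" is a string that represents an integer, callers
   holding an integer need to convert it. The following sketch shows
   one way to do so; the helper name "delete_version" is hypothetical,
   and "client" is assumed to be a Glue client such as
   "boto3.client('glue')".

```python
def delete_version(client, database, table, version, catalog_id=None):
    """Delete one table version, accepting the version as int or str."""
    kwargs = {
        "DatabaseName": database,
        "TableName": table,
        # int() rejects non-numeric input; str() yields the form the API expects.
        "VersionId": str(int(version)),
    }
    if catalog_id is not None:
        kwargs["CatalogId"] = catalog_id
    client.delete_table_version(**kwargs)
    return kwargs
```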


start_column_statistics_task_run
********************************

Glue.Client.start_column_statistics_task_run(**kwargs)

   Starts a column statistics task run, for a specified table and
   columns.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.start_column_statistics_task_run(
          DatabaseName='string',
          TableName='string',
          ColumnNameList=[
              'string',
          ],
          Role='string',
          SampleSize=123.0,
          CatalogID='string',
          SecurityConfiguration='string'
      )

   Parameters:
      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the database where the table resides.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table for which to generate statistics.

      * **ColumnNameList** (*list*) --

        A list of the column names for which to generate statistics.
        If none is supplied, all column names for the table will be
        used by default.

        * *(string) --*

      * **Role** (*string*) --

        **[REQUIRED]**

        The IAM role that the service assumes to generate statistics.

      * **SampleSize** (*float*) -- The percentage of rows used to
        generate statistics. If none is supplied, the entire table
        will be used to generate stats.

      * **CatalogID** (*string*) -- The ID of the Data Catalog where
        the table resides. If none is supplied, the Amazon Web
        Services account ID is used by default.

      * **SecurityConfiguration** (*string*) -- Name of the security
        configuration that is used to encrypt CloudWatch logs for the
        column stats task run.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'ColumnStatisticsTaskRunId': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **ColumnStatisticsTaskRunId** *(string) --*

          The identifier for the column statistics task run.

   **Exceptions**

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.ColumnStatisticsTaskRunningException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.InvalidInputException"
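
   The optional parameters can be left out of the request entirely, as
   the sketch below does. The helper name "column_stats_request" is
   hypothetical, and the 0 to 100 range check on the percentage is an
   assumption made for this example. Note that this operation spells
   the catalog key "CatalogID", with a capital "D".

```python
def column_stats_request(database, table, role, columns=None,
                         sample_percent=None, catalog_id=None):
    """Build keyword arguments for start_column_statistics_task_run."""
    request = {"DatabaseName": database, "TableName": table, "Role": role}
    if columns:
        request["ColumnNameList"] = list(columns)
    if sample_percent is not None:
        # Assumption: a percentage of rows is expected to lie in [0, 100].
        if not 0 <= sample_percent <= 100:
            raise ValueError("sample_percent must be between 0 and 100")
        request["SampleSize"] = float(sample_percent)
    if catalog_id is not None:
        # This operation uses "CatalogID", not "CatalogId".
        request["CatalogID"] = catalog_id
    return request
```

   With a Glue client, the result would be passed as
   "client.start_column_statistics_task_run(**request)".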


close
*****

Glue.Client.close()

   Closes underlying endpoint connections.
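
   Since the client exposes a "close()" method, it can be paired with
   "contextlib.closing" so that connections are released even when an
   error occurs. This is a minimal sketch; the helper name
   "with_glue_client" is hypothetical, and "make_client" is assumed to
   be a callable such as "lambda: boto3.client('glue')".

```python
from contextlib import closing


def with_glue_client(make_client, work):
    """Run work(client) and always call client.close() afterwards."""
    with closing(make_client()) as client:
        return work(client)
```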


put_resource_policy
*******************

Glue.Client.put_resource_policy(**kwargs)

   Sets the Data Catalog resource policy for access control.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.put_resource_policy(
          PolicyInJson='string',
          ResourceArn='string',
          PolicyHashCondition='string',
          PolicyExistsCondition='MUST_EXIST'|'NOT_EXIST'|'NONE',
          EnableHybrid='TRUE'|'FALSE'
      )

   Parameters:
      * **PolicyInJson** (*string*) --

        **[REQUIRED]**

        Contains the policy document to set, in JSON format.

      * **ResourceArn** (*string*) -- Do not use. For internal use
        only.

      * **PolicyHashCondition** (*string*) -- The hash value returned
        when the previous policy was set using "PutResourcePolicy".
        Its purpose is to prevent concurrent modifications of a
        policy. Do not use this parameter if no previous policy has
        been set.

      * **PolicyExistsCondition** (*string*) -- A value of
        "MUST_EXIST" is used to update a policy. A value of
        "NOT_EXIST" is used to create a new policy. If a value of
        "NONE" or a null value is used, the call does not depend on
        the existence of a policy.

      * **EnableHybrid** (*string*) --

        If "'TRUE'", indicates that you are using both methods to
        grant cross-account access to Data Catalog resources:

        * By directly updating the resource policy with
          "PutResourcePolicy"

        * By using the **Grant permissions** command on the Amazon Web
          Services Management Console.

        Must be set to "'TRUE'" if you have already used the
        Management Console to grant cross-account access, otherwise
        the call fails. Default is 'FALSE'.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'PolicyHash': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **PolicyHash** *(string) --*

          A hash of the policy that has just been set. This must be
          included in a subsequent call that overwrites or updates
          this policy.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.ConditionCheckFailureException"
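
   The interplay of "PolicyHashCondition" and "PolicyExistsCondition"
   can be sketched as follows. The helper name "put_catalog_policy" is
   hypothetical, and "client" is assumed to be a Glue client such as
   "boto3.client('glue')". The sketch chooses "NOT_EXIST" when no
   previous hash is known, which is one possible policy rather than a
   requirement of the API.

```python
import json


def put_catalog_policy(client, policy, previous_hash=None):
    """Create or update the Data Catalog policy; return the new hash."""
    kwargs = {"PolicyInJson": json.dumps(policy)}
    if previous_hash is None:
        # No earlier policy is known: only succeed if none exists yet.
        kwargs["PolicyExistsCondition"] = "NOT_EXIST"
    else:
        # Update in place, guarded against concurrent modification.
        kwargs["PolicyExistsCondition"] = "MUST_EXIST"
        kwargs["PolicyHashCondition"] = previous_hash
    return client.put_resource_policy(**kwargs)["PolicyHash"]
```

   The returned hash can be kept and passed as "previous_hash" on the
   next update.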


get_resource_policy
*******************

Glue.Client.get_resource_policy(**kwargs)

   Retrieves a specified resource policy.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_resource_policy(
          ResourceArn='string'
      )

   Parameters:
      **ResourceArn** (*string*) -- The ARN of the Glue resource for
      which to retrieve the resource policy. If not supplied, the Data
      Catalog resource policy is returned. Use "GetResourcePolicies"
      to view all existing resource policies. For more information see
      Specifying Glue Resource ARNs.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'PolicyInJson': 'string',
             'PolicyHash': 'string',
             'CreateTime': datetime(2015, 1, 1),
             'UpdateTime': datetime(2015, 1, 1)
         }

      **Response Structure**

      * *(dict) --*

        * **PolicyInJson** *(string) --*

          Contains the requested policy document, in JSON format.

        * **PolicyHash** *(string) --*

          Contains the hash value associated with this policy.

        * **CreateTime** *(datetime) --*

          The date and time at which the policy was created.

        * **UpdateTime** *(datetime) --*

          The date and time at which the policy was last updated.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"
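
   Since "PolicyInJson" is a JSON string, it is usually decoded before
   use. The following is a minimal sketch; the helper name
   "load_policy" is hypothetical, "client" is assumed to be a Glue
   client such as "boto3.client('glue')", and the response is assumed
   to contain both "PolicyInJson" and "PolicyHash".

```python
import json


def load_policy(client, resource_arn=None):
    """Return (policy_document, policy_hash) for a resource policy."""
    # Omitting ResourceArn requests the Data Catalog resource policy.
    kwargs = {"ResourceArn": resource_arn} if resource_arn else {}
    response = client.get_resource_policy(**kwargs)
    return json.loads(response["PolicyInJson"]), response["PolicyHash"]
```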


list_blueprints
***************

Glue.Client.list_blueprints(**kwargs)

   Lists all the blueprint names in an account.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.list_blueprints(
          NextToken='string',
          MaxResults=123,
          Tags={
              'string': 'string'
          }
      )

   Parameters:
      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation request.

      * **MaxResults** (*integer*) -- The maximum size of a list to
        return.

      * **Tags** (*dict*) --

        Filters the list by an Amazon Web Services resource tag.

        * *(string) --*

          * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Blueprints': [
                 'string',
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Blueprints** *(list) --*

          List of names of blueprints in the account.

          * *(string) --*

        * **NextToken** *(string) --*

          A continuation token, if not all blueprint names have been
          returned.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
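
   Listing every blueprint name means following "NextToken" until it
   is absent. The generator below is a minimal sketch of that loop;
   the name "iter_blueprint_names" is hypothetical, and "client" is
   assumed to be a Glue client such as "boto3.client('glue')".

```python
def iter_blueprint_names(client, tags=None, page_size=None):
    """Yield blueprint names from every page of list_blueprints."""
    kwargs = {}
    if tags:
        kwargs["Tags"] = dict(tags)
    if page_size is not None:
        kwargs["MaxResults"] = page_size
    while True:
        page = client.list_blueprints(**kwargs)
        yield from page.get("Blueprints", [])
        token = page.get("NextToken")
        if not token:
            # No continuation token: all names have been returned.
            return
        kwargs["NextToken"] = token
```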


get_data_quality_ruleset_evaluation_run
***************************************

Glue.Client.get_data_quality_ruleset_evaluation_run(**kwargs)

   Retrieves a specific run where a ruleset is evaluated against a
   data source.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_data_quality_ruleset_evaluation_run(
          RunId='string'
      )

   Parameters:
      **RunId** (*string*) --

      **[REQUIRED]**

      The unique run identifier associated with this run.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'RunId': 'string',
             'DataSource': {
                 'GlueTable': {
                     'DatabaseName': 'string',
                     'TableName': 'string',
                     'CatalogId': 'string',
                     'ConnectionName': 'string',
                     'AdditionalOptions': {
                         'string': 'string'
                     }
                 }
             },
             'Role': 'string',
             'NumberOfWorkers': 123,
             'Timeout': 123,
             'AdditionalRunOptions': {
                 'CloudWatchMetricsEnabled': True|False,
                 'ResultsS3Prefix': 'string',
                 'CompositeRuleEvaluationMethod': 'COLUMN'|'ROW'
             },
             'Status': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT',
             'ErrorString': 'string',
             'StartedOn': datetime(2015, 1, 1),
             'LastModifiedOn': datetime(2015, 1, 1),
             'CompletedOn': datetime(2015, 1, 1),
             'ExecutionTime': 123,
             'RulesetNames': [
                 'string',
             ],
             'ResultIds': [
                 'string',
             ],
             'AdditionalDataSources': {
                 'string': {
                     'GlueTable': {
                         'DatabaseName': 'string',
                         'TableName': 'string',
                         'CatalogId': 'string',
                         'ConnectionName': 'string',
                         'AdditionalOptions': {
                             'string': 'string'
                         }
                     }
                 }
             }
         }

      **Response Structure**

      * *(dict) --*

        * **RunId** *(string) --*

          The unique run identifier associated with this run.

        * **DataSource** *(dict) --*

          The data source (a Glue table) associated with this
          evaluation run.

          * **GlueTable** *(dict) --*

            A Glue table.

            * **DatabaseName** *(string) --*

              A database name in the Glue Data Catalog.

            * **TableName** *(string) --*

              A table name in the Glue Data Catalog.

            * **CatalogId** *(string) --*

              A unique identifier for the Glue Data Catalog.

            * **ConnectionName** *(string) --*

              The name of the connection to the Glue Data Catalog.

            * **AdditionalOptions** *(dict) --*

              Additional options for the table. Currently there are
              two keys supported:

              * "pushDownPredicate": to filter on partitions without
                having to list and read all the files in your dataset.

              * "catalogPartitionPredicate": to use server-side
                partition pruning using partition indexes in the Glue
                Data Catalog.

              * *(string) --*

                * *(string) --*

        * **Role** *(string) --*

          An IAM role supplied to encrypt the results of the run.

        * **NumberOfWorkers** *(integer) --*

          The number of "G.1X" workers to be used in the run. The
          default is 5.

        * **Timeout** *(integer) --*

          The timeout for a run in minutes. This is the maximum time
          that a run can consume resources before it is terminated and
          enters "TIMEOUT" status. The default is 2,880 minutes (48
          hours).

        * **AdditionalRunOptions** *(dict) --*

          Additional run options you can specify for an evaluation
          run.

          * **CloudWatchMetricsEnabled** *(boolean) --*

            Whether or not to enable CloudWatch metrics.

          * **ResultsS3Prefix** *(string) --*

            Prefix for Amazon S3 to store results.

          * **CompositeRuleEvaluationMethod** *(string) --*

            The evaluation method for composite rules in the ruleset:
            "ROW" or "COLUMN".

        * **Status** *(string) --*

          The status for this run.

        * **ErrorString** *(string) --*

          The error strings that are associated with the run.

        * **StartedOn** *(datetime) --*

          The date and time when this run started.

        * **LastModifiedOn** *(datetime) --*

          A timestamp. The last point in time when this data quality
          ruleset evaluation run was modified.

        * **CompletedOn** *(datetime) --*

          The date and time when this run was completed.

        * **ExecutionTime** *(integer) --*

          The amount of time (in seconds) that the run consumed
          resources.

        * **RulesetNames** *(list) --*

          A list of ruleset names for the run. Currently, this
          parameter takes only one Ruleset name.

          * *(string) --*

        * **ResultIds** *(list) --*

          A list of result IDs for the data quality results for the
          run.

          * *(string) --*

        * **AdditionalDataSources** *(dict) --*

          A map of reference strings to additional data sources you
          can specify for an evaluation run.

          * *(string) --*

            * *(dict) --*

              A data source (a Glue table) for which you want data
              quality results.

              * **GlueTable** *(dict) --*

                A Glue table.

                * **DatabaseName** *(string) --*

                  A database name in the Glue Data Catalog.

                * **TableName** *(string) --*

                  A table name in the Glue Data Catalog.

                * **CatalogId** *(string) --*

                  A unique identifier for the Glue Data Catalog.

                * **ConnectionName** *(string) --*

                  The name of the connection to the Glue Data Catalog.

                * **AdditionalOptions** *(dict) --*

                  Additional options for the table. Currently there
                  are two keys supported:

                  * "pushDownPredicate": to filter on partitions
                    without having to list and read all the files in
                    your dataset.

                  * "catalogPartitionPredicate": to use server-side
                    partition pruning using partition indexes in the
                    Glue Data Catalog.

                  * *(string) --*

                    * *(string) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"
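
   A common use of this call is to poll until the run leaves the
   "STARTING", "RUNNING", and "STOPPING" states. The following is a
   minimal sketch of such a loop; the helper name
   "wait_for_evaluation_run", the polling interval, and the poll limit
   are all hypothetical choices, and "client" is assumed to be a Glue
   client such as "boto3.client('glue')".

```python
import time

# Statuses from the response syntax after which the run no longer changes.
TERMINAL_STATUSES = {"STOPPED", "SUCCEEDED", "FAILED", "TIMEOUT"}


def wait_for_evaluation_run(client, run_id, poll_seconds=30, max_polls=120,
                            sleep=time.sleep):
    """Poll an evaluation run until it reaches a terminal status."""
    for _ in range(max_polls):
        run = client.get_data_quality_ruleset_evaluation_run(RunId=run_id)
        if run["Status"] in TERMINAL_STATUSES:
            return run
        sleep(poll_seconds)
    raise TimeoutError(f"run {run_id} did not finish after {max_polls} polls")
```

   The "sleep" argument exists only so that the wait can be replaced
   in tests.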


get_column_statistics_for_table
*******************************

Glue.Client.get_column_statistics_for_table(**kwargs)

   Retrieves table statistics of columns.

   The Identity and Access Management (IAM) permission required for
   this operation is "GetTable".

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_column_statistics_for_table(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          ColumnNames=[
              'string',
          ]
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog where
        the partitions in question reside. If none is supplied, the
        Amazon Web Services account ID is used by default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the catalog database where the partitions reside.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the partitions' table.

      * **ColumnNames** (*list*) --

        **[REQUIRED]**

        A list of the column names.

        * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'ColumnStatisticsList': [
                 {
                     'ColumnName': 'string',
                     'ColumnType': 'string',
                     'AnalyzedTime': datetime(2015, 1, 1),
                     'StatisticsData': {
                         'Type': 'BOOLEAN'|'DATE'|'DECIMAL'|'DOUBLE'|'LONG'|'STRING'|'BINARY',
                         'BooleanColumnStatisticsData': {
                             'NumberOfTrues': 123,
                             'NumberOfFalses': 123,
                             'NumberOfNulls': 123
                         },
                         'DateColumnStatisticsData': {
                             'MinimumValue': datetime(2015, 1, 1),
                             'MaximumValue': datetime(2015, 1, 1),
                             'NumberOfNulls': 123,
                             'NumberOfDistinctValues': 123
                         },
                         'DecimalColumnStatisticsData': {
                             'MinimumValue': {
                                 'UnscaledValue': b'bytes',
                                 'Scale': 123
                             },
                             'MaximumValue': {
                                 'UnscaledValue': b'bytes',
                                 'Scale': 123
                             },
                             'NumberOfNulls': 123,
                             'NumberOfDistinctValues': 123
                         },
                         'DoubleColumnStatisticsData': {
                             'MinimumValue': 123.0,
                             'MaximumValue': 123.0,
                             'NumberOfNulls': 123,
                             'NumberOfDistinctValues': 123
                         },
                         'LongColumnStatisticsData': {
                             'MinimumValue': 123,
                             'MaximumValue': 123,
                             'NumberOfNulls': 123,
                             'NumberOfDistinctValues': 123
                         },
                         'StringColumnStatisticsData': {
                             'MaximumLength': 123,
                             'AverageLength': 123.0,
                             'NumberOfNulls': 123,
                             'NumberOfDistinctValues': 123
                         },
                         'BinaryColumnStatisticsData': {
                             'MaximumLength': 123,
                             'AverageLength': 123.0,
                             'NumberOfNulls': 123
                         }
                     }
                 },
             ],
             'Errors': [
                 {
                     'ColumnName': 'string',
                     'Error': {
                         'ErrorCode': 'string',
                         'ErrorMessage': 'string'
                     }
                 },
             ]
         }

      **Response Structure**

      * *(dict) --*

        * **ColumnStatisticsList** *(list) --*

          List of ColumnStatistics.

          * *(dict) --*

            Represents the generated column-level statistics for a
            table or partition.

            * **ColumnName** *(string) --*

              The name of the column that the statistics belong to.

            * **ColumnType** *(string) --*

              The data type of the column.

            * **AnalyzedTime** *(datetime) --*

              The timestamp of when column statistics were generated.

            * **StatisticsData** *(dict) --*

              A "ColumnStatisticData" object that contains the
              statistics data values.

              * **Type** *(string) --*

                The type of column statistics data.

              * **BooleanColumnStatisticsData** *(dict) --*

                Boolean column statistics data.

                * **NumberOfTrues** *(integer) --*

                  The number of true values in the column.

                * **NumberOfFalses** *(integer) --*

                  The number of false values in the column.

                * **NumberOfNulls** *(integer) --*

                  The number of null values in the column.

              * **DateColumnStatisticsData** *(dict) --*

                Date column statistics data.

                * **MinimumValue** *(datetime) --*

                  The lowest value in the column.

                * **MaximumValue** *(datetime) --*

                  The highest value in the column.

                * **NumberOfNulls** *(integer) --*

                  The number of null values in the column.

                * **NumberOfDistinctValues** *(integer) --*

                  The number of distinct values in a column.

              * **DecimalColumnStatisticsData** *(dict) --*

                Decimal column statistics data. UnscaledValues within
                are Base64-encoded binary objects storing big-endian,
                two's complement representations of the decimal's
                unscaled value.

                * **MinimumValue** *(dict) --*

                  The lowest value in the column.

                  * **UnscaledValue** *(bytes) --*

                    The unscaled numeric value.

                  * **Scale** *(integer) --*

                    The scale that determines where the decimal point
                    falls in the unscaled value.

                * **MaximumValue** *(dict) --*

                  The highest value in the column.

                  * **UnscaledValue** *(bytes) --*

                    The unscaled numeric value.

                  * **Scale** *(integer) --*

                    The scale that determines where the decimal point
                    falls in the unscaled value.

                * **NumberOfNulls** *(integer) --*

                  The number of null values in the column.

                * **NumberOfDistinctValues** *(integer) --*

                  The number of distinct values in a column.

              * **DoubleColumnStatisticsData** *(dict) --*

                Double column statistics data.

                * **MinimumValue** *(float) --*

                  The lowest value in the column.

                * **MaximumValue** *(float) --*

                  The highest value in the column.

                * **NumberOfNulls** *(integer) --*

                  The number of null values in the column.

                * **NumberOfDistinctValues** *(integer) --*

                  The number of distinct values in a column.

              * **LongColumnStatisticsData** *(dict) --*

                Long column statistics data.

                * **MinimumValue** *(integer) --*

                  The lowest value in the column.

                * **MaximumValue** *(integer) --*

                  The highest value in the column.

                * **NumberOfNulls** *(integer) --*

                  The number of null values in the column.

                * **NumberOfDistinctValues** *(integer) --*

                  The number of distinct values in a column.

              * **StringColumnStatisticsData** *(dict) --*

                String column statistics data.

                * **MaximumLength** *(integer) --*

                  The size of the longest string in the column.

                * **AverageLength** *(float) --*

                  The average string length in the column.

                * **NumberOfNulls** *(integer) --*

                  The number of null values in the column.

                * **NumberOfDistinctValues** *(integer) --*

                  The number of distinct values in a column.

              * **BinaryColumnStatisticsData** *(dict) --*

                Binary column statistics data.

                * **MaximumLength** *(integer) --*

                  The size of the longest bit sequence in the column.

                * **AverageLength** *(float) --*

                  The average bit sequence length in the column.

                * **NumberOfNulls** *(integer) --*

                  The number of null values in the column.

        * **Errors** *(list) --*

          List of ColumnStatistics that failed to be retrieved.

          * *(dict) --*

            Encapsulates a column name that failed and the reason for
            failure.

            * **ColumnName** *(string) --*

              The name of the column that failed.

            * **Error** *(dict) --*

              An error message with the reason for the failure of an
              operation.

              * **ErrorCode** *(string) --*

                The code associated with this error.

              * **ErrorMessage** *(string) --*

                A message describing the error.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"
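
   A minimal sketch of calling this method and separating the two
   halves of the response: statistics that were returned and columns
   that failed. The helper name "split_column_statistics" is
   hypothetical and not part of boto3; "client" is assumed to be a
   Glue client created with "boto3.client('glue')".

```python
def split_column_statistics(client, database, table, columns, catalog_id=None):
    """Return (stats_by_column, errors_by_column) for one table.

    `client` is assumed to be a boto3 Glue client. The function name is
    illustrative only.
    """
    kwargs = {
        "DatabaseName": database,
        "TableName": table,
        "ColumnNames": list(columns),
    }
    # CatalogId is optional; leaving it out falls back to the account default.
    if catalog_id is not None:
        kwargs["CatalogId"] = catalog_id

    response = client.get_column_statistics_for_table(**kwargs)

    # Index successful entries by column name.
    stats = {
        entry["ColumnName"]: entry
        for entry in response.get("ColumnStatisticsList", [])
    }
    # Index per-column failures by column name, keeping only the error detail.
    errors = {
        entry["ColumnName"]: entry["Error"]
        for entry in response.get("Errors", [])
    }
    return stats, errors
```

   A call might look like "stats, errors =
   split_column_statistics(client, 'sales_db', 'orders', ['price'])",
   where the database, table, and column names are placeholders.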


delete_blueprint
****************

Glue.Client.delete_blueprint(**kwargs)

   Deletes an existing blueprint.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_blueprint(
          Name='string'
      )

   Parameters:
      **Name** (*string*) --

      **[REQUIRED]**

      The name of the blueprint to delete.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Name': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Name** *(string) --*

          Returns the name of the blueprint that was deleted.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"
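
   A short sketch of deleting several blueprints in turn and collecting
   the names that the service reports back. The helper name
   "delete_blueprints" is hypothetical; "client" is assumed to be a
   Glue client created with "boto3.client('glue')".

```python
def delete_blueprints(client, names):
    """Delete each named blueprint and return the names echoed in the responses.

    `client` is assumed to be a boto3 Glue client. Any exception raised by
    the service is left to propagate to the caller.
    """
    deleted = []
    for name in names:
        response = client.delete_blueprint(Name=name)
        # The response carries the name of the blueprint that was deleted.
        deleted.append(response.get("Name"))
    return deleted
```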


start_ml_labeling_set_generation_task_run
*****************************************

Glue.Client.start_ml_labeling_set_generation_task_run(**kwargs)

   Starts the active learning workflow for your machine learning
   transform to improve the transform's quality by generating label
   sets and adding labels.

   When "StartMLLabelingSetGenerationTaskRun" finishes, Glue will have
   generated a "labeling set", that is, a set of questions for humans
   to answer.

   In the case of the "FindMatches" transform, these questions are of
   the form, “What is the correct way to group these rows together
   into groups composed entirely of matching records?”

   After the labeling process is finished, you can upload your labels
   with a call to "StartImportLabelsTaskRun". After
   "StartImportLabelsTaskRun" finishes, all future runs of the machine
   learning transform will use the new and improved labels and perform
   a higher-quality transformation.

   Note: The role used to write the generated labeling set to the
   "OutputS3Path" is the role associated with the Machine Learning
   Transform, specified in the "CreateMLTransform" API.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.start_ml_labeling_set_generation_task_run(
          TransformId='string',
          OutputS3Path='string'
      )

   Parameters:
      * **TransformId** (*string*) --

        **[REQUIRED]**

        The unique identifier of the machine learning transform.

      * **OutputS3Path** (*string*) --

        **[REQUIRED]**

        The Amazon Simple Storage Service (Amazon S3) path where you
        generate the labeling set.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'TaskRunId': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **TaskRunId** *(string) --*

          The unique run identifier that is associated with this task
          run.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.ConcurrentRunsExceededException"
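
   A minimal sketch of starting a labeling set generation run and
   keeping the returned run identifier. The helper name
   "start_labeling_set_generation" and the "s3://" prefix check are
   assumptions of this sketch, not documented behavior; "client" is
   assumed to be a Glue client created with "boto3.client('glue')".

```python
def start_labeling_set_generation(client, transform_id, output_s3_path):
    """Start the task run and return its TaskRunId.

    `client` is assumed to be a boto3 Glue client.
    """
    # Assumption of this sketch: the output location is written as an s3:// URI.
    if not output_s3_path.startswith("s3://"):
        raise ValueError("output_s3_path should look like s3://bucket/prefix")

    response = client.start_ml_labeling_set_generation_task_run(
        TransformId=transform_id,
        OutputS3Path=output_s3_path,
    )
    return response["TaskRunId"]
```

   The returned identifier can then be passed, together with the
   transform ID, to "get_ml_task_run" to follow the run.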


get_column_statistics_for_partition
***********************************

Glue.Client.get_column_statistics_for_partition(**kwargs)

   Retrieves partition statistics of columns.

   The Identity and Access Management (IAM) permission required for
   this operation is "GetPartition".

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_column_statistics_for_partition(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          PartitionValues=[
              'string',
          ],
          ColumnNames=[
              'string',
          ]
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog where
        the partitions in question reside. If none is supplied, the
        Amazon Web Services account ID is used by default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the catalog database where the partitions reside.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the partitions' table.

      * **PartitionValues** (*list*) --

        **[REQUIRED]**

        A list of partition values identifying the partition.

        * *(string) --*

      * **ColumnNames** (*list*) --

        **[REQUIRED]**

        A list of the column names.

        * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'ColumnStatisticsList': [
                 {
                     'ColumnName': 'string',
                     'ColumnType': 'string',
                     'AnalyzedTime': datetime(2015, 1, 1),
                     'StatisticsData': {
                         'Type': 'BOOLEAN'|'DATE'|'DECIMAL'|'DOUBLE'|'LONG'|'STRING'|'BINARY',
                         'BooleanColumnStatisticsData': {
                             'NumberOfTrues': 123,
                             'NumberOfFalses': 123,
                             'NumberOfNulls': 123
                         },
                         'DateColumnStatisticsData': {
                             'MinimumValue': datetime(2015, 1, 1),
                             'MaximumValue': datetime(2015, 1, 1),
                             'NumberOfNulls': 123,
                             'NumberOfDistinctValues': 123
                         },
                         'DecimalColumnStatisticsData': {
                             'MinimumValue': {
                                 'UnscaledValue': b'bytes',
                                 'Scale': 123
                             },
                             'MaximumValue': {
                                 'UnscaledValue': b'bytes',
                                 'Scale': 123
                             },
                             'NumberOfNulls': 123,
                             'NumberOfDistinctValues': 123
                         },
                         'DoubleColumnStatisticsData': {
                             'MinimumValue': 123.0,
                             'MaximumValue': 123.0,
                             'NumberOfNulls': 123,
                             'NumberOfDistinctValues': 123
                         },
                         'LongColumnStatisticsData': {
                             'MinimumValue': 123,
                             'MaximumValue': 123,
                             'NumberOfNulls': 123,
                             'NumberOfDistinctValues': 123
                         },
                         'StringColumnStatisticsData': {
                             'MaximumLength': 123,
                             'AverageLength': 123.0,
                             'NumberOfNulls': 123,
                             'NumberOfDistinctValues': 123
                         },
                         'BinaryColumnStatisticsData': {
                             'MaximumLength': 123,
                             'AverageLength': 123.0,
                             'NumberOfNulls': 123
                         }
                     }
                 },
             ],
             'Errors': [
                 {
                     'ColumnName': 'string',
                     'Error': {
                         'ErrorCode': 'string',
                         'ErrorMessage': 'string'
                     }
                 },
             ]
         }

      **Response Structure**

      * *(dict) --*

        * **ColumnStatisticsList** *(list) --*

          List of ColumnStatistics that were retrieved.

          * *(dict) --*

            Represents the generated column-level statistics for a
            table or partition.

            * **ColumnName** *(string) --*

              The name of the column that the statistics belong to.

            * **ColumnType** *(string) --*

              The data type of the column.

            * **AnalyzedTime** *(datetime) --*

              The timestamp of when column statistics were generated.

            * **StatisticsData** *(dict) --*

              A "ColumnStatisticData" object that contains the
              statistics data values.

              * **Type** *(string) --*

                The type of column statistics data.

              * **BooleanColumnStatisticsData** *(dict) --*

                Boolean column statistics data.

                * **NumberOfTrues** *(integer) --*

                  The number of true values in the column.

                * **NumberOfFalses** *(integer) --*

                  The number of false values in the column.

                * **NumberOfNulls** *(integer) --*

                  The number of null values in the column.

              * **DateColumnStatisticsData** *(dict) --*

                Date column statistics data.

                * **MinimumValue** *(datetime) --*

                  The lowest value in the column.

                * **MaximumValue** *(datetime) --*

                  The highest value in the column.

                * **NumberOfNulls** *(integer) --*

                  The number of null values in the column.

                * **NumberOfDistinctValues** *(integer) --*

                  The number of distinct values in a column.

              * **DecimalColumnStatisticsData** *(dict) --*

                Decimal column statistics data. UnscaledValues within
                are Base64-encoded binary objects storing big-endian,
                two's complement representations of the decimal's
                unscaled value.

                * **MinimumValue** *(dict) --*

                  The lowest value in the column.

                  * **UnscaledValue** *(bytes) --*

                    The unscaled numeric value.

                  * **Scale** *(integer) --*

                    The scale that determines where the decimal point
                    falls in the unscaled value.

                * **MaximumValue** *(dict) --*

                  The highest value in the column.

                  * **UnscaledValue** *(bytes) --*

                    The unscaled numeric value.

                  * **Scale** *(integer) --*

                    The scale that determines where the decimal point
                    falls in the unscaled value.

                * **NumberOfNulls** *(integer) --*

                  The number of null values in the column.

                * **NumberOfDistinctValues** *(integer) --*

                  The number of distinct values in a column.

              * **DoubleColumnStatisticsData** *(dict) --*

                Double column statistics data.

                * **MinimumValue** *(float) --*

                  The lowest value in the column.

                * **MaximumValue** *(float) --*

                  The highest value in the column.

                * **NumberOfNulls** *(integer) --*

                  The number of null values in the column.

                * **NumberOfDistinctValues** *(integer) --*

                  The number of distinct values in a column.

              * **LongColumnStatisticsData** *(dict) --*

                Long column statistics data.

                * **MinimumValue** *(integer) --*

                  The lowest value in the column.

                * **MaximumValue** *(integer) --*

                  The highest value in the column.

                * **NumberOfNulls** *(integer) --*

                  The number of null values in the column.

                * **NumberOfDistinctValues** *(integer) --*

                  The number of distinct values in a column.

              * **StringColumnStatisticsData** *(dict) --*

                String column statistics data.

                * **MaximumLength** *(integer) --*

                  The size of the longest string in the column.

                * **AverageLength** *(float) --*

                  The average string length in the column.

                * **NumberOfNulls** *(integer) --*

                  The number of null values in the column.

                * **NumberOfDistinctValues** *(integer) --*

                  The number of distinct values in a column.

              * **BinaryColumnStatisticsData** *(dict) --*

                Binary column statistics data.

                * **MaximumLength** *(integer) --*

                  The size of the longest bit sequence in the column.

                * **AverageLength** *(float) --*

                  The average bit sequence length in the column.

                * **NumberOfNulls** *(integer) --*

                  The number of null values in the column.

        * **Errors** *(list) --*

          A list of errors that occurred while retrieving column
          statistics data.

          * *(dict) --*

            Encapsulates a column name that failed and the reason for
            failure.

            * **ColumnName** *(string) --*

              The name of the column that failed.

            * **Error** *(dict) --*

              An error message with the reason for the failure of an
              operation.

              * **ErrorCode** *(string) --*

                The code associated with this error.

              * **ErrorMessage** *(string) --*

                A message describing the error.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"
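
   A sketch of reading values out of one "ColumnStatisticsList" entry.
   It rests on two assumptions: that the SDK has already decoded
   "UnscaledValue" into raw bytes (as the "b'bytes'" in the response
   syntax suggests), and that the populated statistics member is the
   one whose name matches "Type". The helper names are hypothetical.

```python
from decimal import Decimal


def decimal_from_glue(number):
    """Convert a {'UnscaledValue': bytes, 'Scale': int} dict to a Decimal.

    The bytes are read as a big-endian, two's complement integer, and the
    scale moves the decimal point that many places to the left.
    """
    unscaled = int.from_bytes(number["UnscaledValue"], byteorder="big", signed=True)
    sign = 1 if unscaled < 0 else 0
    digits = tuple(int(ch) for ch in str(abs(unscaled)))
    # Building from a (sign, digits, exponent) tuple is exact and is not
    # rounded to the current decimal context precision.
    return Decimal((sign, digits, -number["Scale"]))


def typed_statistics(column_statistics):
    """Return the statistics member that corresponds to StatisticsData['Type'].

    Assumption: 'LONG' maps to 'LongColumnStatisticsData', 'DECIMAL' to
    'DecimalColumnStatisticsData', and so on. Returns None if it is absent.
    """
    data = column_statistics["StatisticsData"]
    key = data["Type"].capitalize() + "ColumnStatisticsData"
    return data.get(key)
```

   For a "DECIMAL" column, the minimum could then be read with
   "decimal_from_glue(typed_statistics(entry)['MinimumValue'])".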


get_ml_task_run
***************

Glue.Client.get_ml_task_run(**kwargs)

   Gets details for a specific task run on a machine learning
   transform. Machine learning task runs are asynchronous tasks that
   Glue runs on your behalf as part of various machine learning
   workflows. You can check the stats of any task run by calling
   "GetMLTaskRun" with the "TaskRunId" and its parent transform's
   "TransformId".

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_ml_task_run(
          TransformId='string',
          TaskRunId='string'
      )

   Parameters:
      * **TransformId** (*string*) --

        **[REQUIRED]**

        The unique identifier of the machine learning transform.

      * **TaskRunId** (*string*) --

        **[REQUIRED]**

        The unique identifier of the task run.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'TransformId': 'string',
             'TaskRunId': 'string',
             'Status': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT',
             'LogGroupName': 'string',
             'Properties': {
                 'TaskType': 'EVALUATION'|'LABELING_SET_GENERATION'|'IMPORT_LABELS'|'EXPORT_LABELS'|'FIND_MATCHES',
                 'ImportLabelsTaskRunProperties': {
                     'InputS3Path': 'string',
                     'Replace': True|False
                 },
                 'ExportLabelsTaskRunProperties': {
                     'OutputS3Path': 'string'
                 },
                 'LabelingSetGenerationTaskRunProperties': {
                     'OutputS3Path': 'string'
                 },
                 'FindMatchesTaskRunProperties': {
                     'JobId': 'string',
                     'JobName': 'string',
                     'JobRunId': 'string'
                 }
             },
             'ErrorString': 'string',
             'StartedOn': datetime(2015, 1, 1),
             'LastModifiedOn': datetime(2015, 1, 1),
             'CompletedOn': datetime(2015, 1, 1),
             'ExecutionTime': 123
         }

      **Response Structure**

      * *(dict) --*

        * **TransformId** *(string) --*

          The unique identifier of the machine learning transform.

        * **TaskRunId** *(string) --*

          The unique run identifier associated with this run.

        * **Status** *(string) --*

          The status for this task run.

        * **LogGroupName** *(string) --*

          The name of the log group that is associated with the task
          run.

        * **Properties** *(dict) --*

          The properties that are associated with the task run.

          * **TaskType** *(string) --*

            The type of task run.

          * **ImportLabelsTaskRunProperties** *(dict) --*

            The configuration properties for an importing labels task
            run.

            * **InputS3Path** *(string) --*

              The Amazon Simple Storage Service (Amazon S3) path from
              where you will import the labels.

            * **Replace** *(boolean) --*

              Indicates whether to overwrite your existing labels.

          * **ExportLabelsTaskRunProperties** *(dict) --*

            The configuration properties for an exporting labels task
            run.

            * **OutputS3Path** *(string) --*

              The Amazon Simple Storage Service (Amazon S3) path where
              you will export the labels.

          * **LabelingSetGenerationTaskRunProperties** *(dict) --*

            The configuration properties for a labeling set generation
            task run.

            * **OutputS3Path** *(string) --*

              The Amazon Simple Storage Service (Amazon S3) path where
              you will generate the labeling set.

          * **FindMatchesTaskRunProperties** *(dict) --*

            The configuration properties for a find matches task run.

            * **JobId** *(string) --*

              The job ID for the Find Matches task run.

            * **JobName** *(string) --*

              The name assigned to the job for the Find Matches task
              run.

            * **JobRunId** *(string) --*

              The job run ID for the Find Matches task run.

        * **ErrorString** *(string) --*

          The error string that is associated with the task run.

        * **StartedOn** *(datetime) --*

          The date and time when this task run started.

        * **LastModifiedOn** *(datetime) --*

          The date and time when this task run was last modified.

        * **CompletedOn** *(datetime) --*

          The date and time when this task run was completed.

        * **ExecutionTime** *(integer) --*

          The amount of time (in seconds) that the task run consumed
          resources.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"
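
   Because task runs are asynchronous, a caller usually polls this
   method until the run settles. The sketch below is one way to do
   that. Treating "STOPPED", "SUCCEEDED", "FAILED", and "TIMEOUT" as
   final statuses is an assumption of this sketch, and the helper
   name, polling interval, and poll limit are hypothetical.

```python
import time

# Assumption: these statuses are final; all others mean the run is in progress.
TERMINAL_STATUSES = {"STOPPED", "SUCCEEDED", "FAILED", "TIMEOUT"}


def wait_for_ml_task_run(client, transform_id, task_run_id,
                         poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Poll get_ml_task_run until the run reaches a terminal status.

    `client` is assumed to be a boto3 Glue client. `sleep` is injectable
    so the loop can be exercised without real waiting.
    """
    for attempt in range(max_polls):
        run = client.get_ml_task_run(
            TransformId=transform_id,
            TaskRunId=task_run_id,
        )
        if run["Status"] in TERMINAL_STATUSES:
            return run
        # Wait between polls, but not after the final attempt.
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    raise TimeoutError(
        f"task run {task_run_id} did not finish after {max_polls} polls"
    )
```

   The returned dict is the last response, so fields such as
   "ErrorString" and "ExecutionTime" can be read from it directly.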


create_partition_index
**********************

Glue.Client.create_partition_index(**kwargs)

   Creates a specified partition index in an existing table.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_partition_index(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          PartitionIndex={
              'Keys': [
                  'string',
              ],
              'IndexName': 'string'
          }
      )

   Parameters:
      * **CatalogId** (*string*) -- The catalog ID where the table
        resides.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        Specifies the name of a database in which you want to create a
        partition index.

      * **TableName** (*string*) --

        **[REQUIRED]**

        Specifies the name of a table in which you want to create a
        partition index.

      * **PartitionIndex** (*dict*) --

        **[REQUIRED]**

        Specifies a "PartitionIndex" structure to create a partition
        index in an existing table.

        * **Keys** *(list) --* **[REQUIRED]**

          The keys for the partition index.

          * *(string) --*

        * **IndexName** *(string) --* **[REQUIRED]**

          The name of the partition index.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"
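
   A small sketch that assembles the keyword arguments for this method
   from plain values. The helper name "partition_index_request" is
   hypothetical, and rejecting an empty key list is a guard added by
   this sketch rather than documented behavior.

```python
def partition_index_request(database, table, index_name, keys, catalog_id=None):
    """Build the keyword arguments for create_partition_index."""
    keys = list(keys)
    # Guard added by this sketch: an index without keys is treated as a mistake.
    if not keys:
        raise ValueError("a partition index needs at least one key")

    request = {
        "DatabaseName": database,
        "TableName": table,
        "PartitionIndex": {
            "Keys": keys,
            "IndexName": index_name,
        },
    }
    # CatalogId is optional, so it is only sent when given.
    if catalog_id is not None:
        request["CatalogId"] = catalog_id
    return request
```

   With a Glue client in hand, the request could be sent as
   "client.create_partition_index(**partition_index_request('sales_db',
   'orders', 'by_date', ['year', 'month']))", where all four values
   are placeholders.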


delete_connection
*****************

Glue.Client.delete_connection(**kwargs)

   Deletes a connection from the Data Catalog.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_connection(
          CatalogId='string',
          ConnectionName='string'
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog in
        which the connection resides. If none is provided, the Amazon
        Web Services account ID is used by default.

      * **ConnectionName** (*string*) --

        **[REQUIRED]**

        The name of the connection to delete.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.OperationTimeoutException"
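
   A sketch of an idempotent delete that treats a missing connection
   as already deleted. The helper name "delete_connection_if_exists"
   is hypothetical; "client" is assumed to be a Glue client created
   with "boto3.client('glue')", whose "exceptions" attribute exposes
   the exception classes listed above.

```python
def delete_connection_if_exists(client, name, catalog_id=None):
    """Delete a connection; return True if deleted, False if it was not found.

    `client` is assumed to be a boto3 Glue client. Other exceptions, such
    as OperationTimeoutException, are left to propagate.
    """
    kwargs = {"ConnectionName": name}
    # CatalogId is optional, so it is only sent when given.
    if catalog_id is not None:
        kwargs["CatalogId"] = catalog_id

    try:
        client.delete_connection(**kwargs)
    except client.exceptions.EntityNotFoundException:
        return False
    return True
```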


create_script
*************

Glue.Client.create_script(**kwargs)

   Transforms a directed acyclic graph (DAG) into code.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_script(
          DagNodes=[
              {
                  'Id': 'string',
                  'NodeType': 'string',
                  'Args': [
                      {
                          'Name': 'string',
                          'Value': 'string',
                          'Param': True|False
                      },
                  ],
                  'LineNumber': 123
              },
          ],
          DagEdges=[
              {
                  'Source': 'string',
                  'Target': 'string',
                  'TargetParameter': 'string'
              },
          ],
          Language='PYTHON'|'SCALA'
      )

   Parameters:
      * **DagNodes** (*list*) --

        A list of the nodes in the DAG.

        * *(dict) --*

          Represents a node in a directed acyclic graph (DAG).

          * **Id** *(string) --* **[REQUIRED]**

            A node identifier that is unique within the node's graph.

          * **NodeType** *(string) --* **[REQUIRED]**

            The type of node that this is.

          * **Args** *(list) --* **[REQUIRED]**

            Properties of the node, in the form of name-value pairs.

            * *(dict) --*

              An argument or property of a node.

              * **Name** *(string) --* **[REQUIRED]**

                The name of the argument or property.

              * **Value** *(string) --* **[REQUIRED]**

                The value of the argument or property.

              * **Param** *(boolean) --*

                True if the value is used as a parameter.

          * **LineNumber** *(integer) --*

            The line number of the node.

      * **DagEdges** (*list*) --

        A list of the edges in the DAG.

        * *(dict) --*

          Represents a directional edge in a directed acyclic graph
          (DAG).

          * **Source** *(string) --* **[REQUIRED]**

            The ID of the node at which the edge starts.

          * **Target** *(string) --* **[REQUIRED]**

            The ID of the node at which the edge ends.

          * **TargetParameter** *(string) --*

            The target of the edge.

      * **Language** (*string*) -- The programming language of the
        resulting code from the DAG.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'PythonScript': 'string',
             'ScalaCode': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **PythonScript** *(string) --*

          The Python script generated from the DAG.

        * **ScalaCode** *(string) --*

          The Scala code generated from the DAG.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
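
   As a minimal sketch, the request arguments described above can be
   assembled from plain Python data before calling "create_script".
   The helper name "build_script_request", the node IDs, the node
   types, and the argument values are hypothetical and only illustrate
   the shape of "DagNodes" and "DagEdges". The check that every edge
   points at a known node is a local sanity check, not documented
   service behavior. The helper does not call AWS.

```python
def build_script_request(nodes, edges, language="PYTHON"):
    """Assemble keyword arguments for client.create_script.

    nodes: iterable of (node_id, node_type, args), where args maps an
           argument name to its string value.
    edges: iterable of (source_id, target_id) pairs.
    """
    if language not in ("PYTHON", "SCALA"):
        raise ValueError("Language must be 'PYTHON' or 'SCALA'")

    dag_nodes = [
        {
            "Id": node_id,
            "NodeType": node_type,
            # Each argument is a name-value pair; both keys are required.
            "Args": [{"Name": name, "Value": value} for name, value in args.items()],
        }
        for node_id, node_type, args in nodes
    ]

    known_ids = {node["Id"] for node in dag_nodes}
    dag_edges = []
    for source, target in edges:
        # Local sanity check (an assumption of this sketch): both ends must exist.
        if source not in known_ids or target not in known_ids:
            raise ValueError(f"edge {source!r} -> {target!r} references an unknown node")
        dag_edges.append({"Source": source, "Target": target})

    return {"DagNodes": dag_nodes, "DagEdges": dag_edges, "Language": language}


# Hypothetical two-node DAG: one source feeding one sink.
request = build_script_request(
    nodes=[
        ("source0", "DataSource", {"database": '"sales"', "table_name": '"orders"'}),
        ("sink0", "DataSink", {"path": '"s3://example-bucket/out/"'}),
    ],
    edges=[("source0", "sink0")],
)
# With a real client: response = client.create_script(**request)
```

   Depending on "Language", the generated code would then be read from
   "response['PythonScript']" or "response['ScalaCode']".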


get_catalog
***********

Glue.Client.get_catalog(**kwargs)

   Retrieves the definition of the specified catalog from the Glue Data
   Catalog. The catalog name should be all lowercase.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_catalog(
          CatalogId='string'
      )

   Parameters:
      **CatalogId** (*string*) --

      **[REQUIRED]**

      The ID of the parent catalog in which the catalog resides. If
      none is provided, the Amazon Web Services Account Number is used
      by default.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Catalog': {
                 'CatalogId': 'string',
                 'Name': 'string',
                 'ResourceArn': 'string',
                 'Description': 'string',
                 'Parameters': {
                     'string': 'string'
                 },
                 'CreateTime': datetime(2015, 1, 1),
                 'UpdateTime': datetime(2015, 1, 1),
                 'TargetRedshiftCatalog': {
                     'CatalogArn': 'string'
                 },
                 'FederatedCatalog': {
                     'Identifier': 'string',
                     'ConnectionName': 'string',
                     'ConnectionType': 'string'
                 },
                 'CatalogProperties': {
                     'DataLakeAccessProperties': {
                         'DataLakeAccess': True|False,
                         'DataTransferRole': 'string',
                         'KmsKey': 'string',
                         'ManagedWorkgroupName': 'string',
                         'ManagedWorkgroupStatus': 'string',
                         'RedshiftDatabaseName': 'string',
                         'StatusMessage': 'string',
                         'CatalogType': 'string'
                     },
                     'IcebergOptimizationProperties': {
                         'RoleArn': 'string',
                         'Compaction': {
                             'string': 'string'
                         },
                         'Retention': {
                             'string': 'string'
                         },
                         'OrphanFileDeletion': {
                             'string': 'string'
                         },
                         'LastUpdatedTime': datetime(2015, 1, 1)
                     },
                     'CustomProperties': {
                         'string': 'string'
                     }
                 },
                 'CreateTableDefaultPermissions': [
                     {
                         'Principal': {
                             'DataLakePrincipalIdentifier': 'string'
                         },
                         'Permissions': [
                             'ALL'|'SELECT'|'ALTER'|'DROP'|'DELETE'|'INSERT'|'CREATE_DATABASE'|'CREATE_TABLE'|'DATA_LOCATION_ACCESS',
                         ]
                     },
                 ],
                 'CreateDatabaseDefaultPermissions': [
                     {
                         'Principal': {
                             'DataLakePrincipalIdentifier': 'string'
                         },
                         'Permissions': [
                             'ALL'|'SELECT'|'ALTER'|'DROP'|'DELETE'|'INSERT'|'CREATE_DATABASE'|'CREATE_TABLE'|'DATA_LOCATION_ACCESS',
                         ]
                     },
                 ],
                 'AllowFullTableExternalDataAccess': 'True'|'False'
             }
         }

      **Response Structure**

      * *(dict) --*

        * **Catalog** *(dict) --*

          A "Catalog" object. The definition of the specified catalog
          in the Glue Data Catalog.

          * **CatalogId** *(string) --*

            The ID of the catalog. To grant access to the default
            catalog, this field should not be provided.

          * **Name** *(string) --*

            The name of the catalog. Cannot be the same as the account
            ID.

          * **ResourceArn** *(string) --*

            The Amazon Resource Name (ARN) assigned to the catalog
            resource.

          * **Description** *(string) --*

            A description of the catalog. The string can be no more
            than 2048 bytes long and must match the URI address
            multi-line string pattern.

          * **Parameters** *(dict) --*

            A map of key-value pairs that define parameters and
            properties of the catalog.

            * *(string) --*

              * *(string) --*

          * **CreateTime** *(datetime) --*

            The time at which the catalog was created.

          * **UpdateTime** *(datetime) --*

            The time at which the catalog was last updated.

          * **TargetRedshiftCatalog** *(dict) --*

            A "TargetRedshiftCatalog" object that describes a target
            catalog for database resource linking.

            * **CatalogArn** *(string) --*

              The Amazon Resource Name (ARN) of the catalog resource.

          * **FederatedCatalog** *(dict) --*

            A "FederatedCatalog" object that points to an entity
            outside the Glue Data Catalog.

            * **Identifier** *(string) --*

              A unique identifier for the federated catalog.

            * **ConnectionName** *(string) --*

              The name of the connection to an external data source,
              for example a Redshift-federated catalog.

            * **ConnectionType** *(string) --*

              The type of connection used to access the federated
              catalog, specifying the protocol or method for
              connection to the external data source.

          * **CatalogProperties** *(dict) --*

            A "CatalogProperties" object that specifies data lake
            access properties and other custom properties.

            * **DataLakeAccessProperties** *(dict) --*

              A "DataLakeAccessProperties" object with input
              properties to configure data lake access for your
              catalog resource in the Glue Data Catalog.

              * **DataLakeAccess** *(boolean) --*

                Turns on or off data lake access for Apache Spark
                applications that access Amazon Redshift databases in
                the Data Catalog.

              * **DataTransferRole** *(string) --*

                A role that will be assumed by Glue for transferring
                data into/out of the staging bucket during a query.

              * **KmsKey** *(string) --*

                An encryption key that will be used for the staging
                bucket that will be created along with the catalog.

              * **ManagedWorkgroupName** *(string) --*

                The managed Redshift Serverless compute name that is
                created for your catalog resource.

              * **ManagedWorkgroupStatus** *(string) --*

                The managed Redshift Serverless compute status.

              * **RedshiftDatabaseName** *(string) --*

                The default Redshift database resource name in the
                managed compute.

              * **StatusMessage** *(string) --*

                A message that gives more detailed information about
                the managed workgroup status.

              * **CatalogType** *(string) --*

                Specifies a federated catalog type for the native
                catalog resource. The currently supported type is
                "aws:redshift".

            * **IcebergOptimizationProperties** *(dict) --*

              An "IcebergOptimizationPropertiesOutput" object that
              specifies Iceberg table optimization settings for the
              catalog, including configurations for compaction,
              retention, and orphan file deletion operations.

              * **RoleArn** *(string) --*

                The Amazon Resource Name (ARN) of the IAM role that is
                used to perform Iceberg table optimization operations.

              * **Compaction** *(dict) --*

                A map of key-value pairs that specify configuration
                parameters for Iceberg table compaction operations,
                which optimize the layout of data files to improve
                query performance.

                * *(string) --*

                  * *(string) --*

              * **Retention** *(dict) --*

                A map of key-value pairs that specify configuration
                parameters for Iceberg table retention operations,
                which manage the lifecycle of table snapshots to
                control storage costs.

                * *(string) --*

                  * *(string) --*

              * **OrphanFileDeletion** *(dict) --*

                A map of key-value pairs that specify configuration
                parameters for Iceberg orphan file deletion
                operations, which identify and remove files that are
                no longer referenced by the table metadata.

                * *(string) --*

                  * *(string) --*

              * **LastUpdatedTime** *(datetime) --*

                The timestamp when the Iceberg optimization properties
                were last updated.

            * **CustomProperties** *(dict) --*

              Additional key-value properties for the catalog, such as
              column statistics optimizations.

              * *(string) --*

                * *(string) --*

          * **CreateTableDefaultPermissions** *(list) --*

            An array of "PrincipalPermissions" objects. Creates a set
            of default permissions on the table(s) for principals.
            Used by Amazon Web Services Lake Formation. Not used in
            the normal course of Glue operations.

            * *(dict) --*

              Permissions granted to a principal.

              * **Principal** *(dict) --*

                The principal who is granted permissions.

                * **DataLakePrincipalIdentifier** *(string) --*

                  An identifier for the Lake Formation principal.

              * **Permissions** *(list) --*

                The permissions that are granted to the principal.

                * *(string) --*

          * **CreateDatabaseDefaultPermissions** *(list) --*

            An array of "PrincipalPermissions" objects. Creates a set
            of default permissions on the database(s) for principals.
            Used by Amazon Web Services Lake Formation. Not used in
            the normal course of Glue operations.

            * *(dict) --*

              Permissions granted to a principal.

              * **Principal** *(dict) --*

                The principal who is granted permissions.

                * **DataLakePrincipalIdentifier** *(string) --*

                  An identifier for the Lake Formation principal.

              * **Permissions** *(list) --*

                The permissions that are granted to the principal.

                * *(string) --*

          * **AllowFullTableExternalDataAccess** *(string) --*

            Allows third-party engines to access data in Amazon S3
            locations that are registered with Lake Formation.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.FederationSourceException"

   * "Glue.Client.exceptions.FederationSourceRetryableException"
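
   The response structure above is deeply nested, so a small reader
   function can help. The sketch below is illustrative only: the helper
   name "summarize_catalog" and the sample values are hypothetical, and
   it assumes that optional keys may be absent from a response. It
   works on a plain dict and does not call AWS.

```python
def summarize_catalog(response):
    """Pull a few commonly inspected fields out of a get_catalog response."""
    catalog = response.get("Catalog", {})
    properties = catalog.get("CatalogProperties", {})
    data_lake = properties.get("DataLakeAccessProperties", {})
    return {
        "name": catalog.get("Name"),
        "arn": catalog.get("ResourceArn"),
        "is_federated": "FederatedCatalog" in catalog,
        "data_lake_access": data_lake.get("DataLakeAccess", False),
        # AllowFullTableExternalDataAccess is the string 'True' or 'False',
        # not a boolean, so compare against the string.
        "external_access": catalog.get("AllowFullTableExternalDataAccess") == "True",
    }


# Hypothetical response, shaped like the Response Syntax above.
sample_response = {
    "Catalog": {
        "CatalogId": "123456789012:examplecatalog",
        "Name": "examplecatalog",
        "ResourceArn": "arn:aws:glue:us-east-1:123456789012:catalog/examplecatalog",
        "CatalogProperties": {
            "DataLakeAccessProperties": {
                "DataLakeAccess": True,
                "CatalogType": "aws:redshift",
            }
        },
        "AllowFullTableExternalDataAccess": "False",
    }
}
summary = summarize_catalog(sample_response)
```

   With a real client, the same function could be applied to the result
   of "client.get_catalog(CatalogId=...)".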


create_blueprint
****************

Glue.Client.create_blueprint(**kwargs)

   Registers a blueprint with Glue.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_blueprint(
          Name='string',
          Description='string',
          BlueprintLocation='string',
          Tags={
              'string': 'string'
          }
      )

   Parameters:
      * **Name** (*string*) --

        **[REQUIRED]**

        The name of the blueprint.

      * **Description** (*string*) -- A description of the blueprint.

      * **BlueprintLocation** (*string*) --

        **[REQUIRED]**

        Specifies a path in Amazon S3 where the blueprint is
        published.

      * **Tags** (*dict*) --

        The tags to be applied to this blueprint.

        * *(string) --*

          * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Name': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Name** *(string) --*

          Returns the name of the blueprint that was registered.

   **Exceptions**

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"


list_workflows
**************

Glue.Client.list_workflows(**kwargs)

   Lists names of workflows created in the account.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.list_workflows(
          NextToken='string',
          MaxResults=123
      )

   Parameters:
      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation request.

      * **MaxResults** (*integer*) -- The maximum size of a list to
        return.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Workflows': [
                 'string',
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Workflows** *(list) --*

          List of names of workflows in the account.

          * *(string) --*

        * **NextToken** *(string) --*

          A continuation token, if not all workflow names have been
          returned.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
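
   Because results come back in pages, a caller usually loops until no
   "NextToken" is returned. The generator below is a minimal sketch of
   that loop. The name "iter_workflow_names" and the default page size
   are assumptions of this example, and "client" is expected to be a
   Glue client such as the one created with "boto3.client('glue')".

```python
def iter_workflow_names(client, page_size=25):
    """Yield every workflow name, following NextToken until it is absent."""
    kwargs = {"MaxResults": page_size}
    while True:
        response = client.list_workflows(**kwargs)
        yield from response.get("Workflows", [])
        token = response.get("NextToken")
        if not token:
            break
        # Pass the continuation token back to fetch the next page.
        kwargs["NextToken"] = token
```

   For example, "names = list(iter_workflow_names(client))" would
   collect all names into one list.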


get_tags
********

Glue.Client.get_tags(**kwargs)

   Retrieves a list of tags associated with a resource.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_tags(
          ResourceArn='string'
      )

   Parameters:
      **ResourceArn** (*string*) --

      **[REQUIRED]**

      The Amazon Resource Name (ARN) of the resource for which to
      retrieve tags.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Tags': {
                 'string': 'string'
             }
         }

      **Response Structure**

      * *(dict) --*

        * **Tags** *(dict) --*

          The requested tags.

          * *(string) --*

            * *(string) --*

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.EntityNotFoundException"


get_user_defined_function
*************************

Glue.Client.get_user_defined_function(**kwargs)

   Retrieves a specified function definition from the Data Catalog.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_user_defined_function(
          CatalogId='string',
          DatabaseName='string',
          FunctionName='string'
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog where
        the function to be retrieved is located. If none is provided,
        the Amazon Web Services account ID is used by default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the catalog database where the function is
        located.

      * **FunctionName** (*string*) --

        **[REQUIRED]**

        The name of the function.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'UserDefinedFunction': {
                 'FunctionName': 'string',
                 'DatabaseName': 'string',
                 'ClassName': 'string',
                 'OwnerName': 'string',
                 'OwnerType': 'USER'|'ROLE'|'GROUP',
                 'CreateTime': datetime(2015, 1, 1),
                 'ResourceUris': [
                     {
                         'ResourceType': 'JAR'|'FILE'|'ARCHIVE',
                         'Uri': 'string'
                     },
                 ],
                 'CatalogId': 'string'
             }
         }

      **Response Structure**

      * *(dict) --*

        * **UserDefinedFunction** *(dict) --*

          The requested function definition.

          * **FunctionName** *(string) --*

            The name of the function.

          * **DatabaseName** *(string) --*

            The name of the catalog database that contains the
            function.

          * **ClassName** *(string) --*

            The Java class that contains the function code.

          * **OwnerName** *(string) --*

            The owner of the function.

          * **OwnerType** *(string) --*

            The owner type.

          * **CreateTime** *(datetime) --*

            The time at which the function was created.

          * **ResourceUris** *(list) --*

            The resource URIs for the function.

            * *(dict) --*

              The URIs for function resources.

              * **ResourceType** *(string) --*

                The type of the resource.

              * **Uri** *(string) --*

                The URI for accessing the resource.

          * **CatalogId** *(string) --*

            The ID of the Data Catalog in which the function resides.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"
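
   A common follow-up is to find the JAR files that back a function.
   The sketch below reads them from a response dict. The helper name
   "jar_uris" and the sample values are hypothetical, and the code
   assumes "ResourceUris" may be missing when a function has no
   resources.

```python
def jar_uris(response):
    """Return the URIs of all JAR resources attached to a function."""
    function = response["UserDefinedFunction"]
    return [
        resource["Uri"]
        for resource in function.get("ResourceUris", [])
        if resource.get("ResourceType") == "JAR"
    ]


# Hypothetical response, shaped like the Response Syntax above.
sample_response = {
    "UserDefinedFunction": {
        "FunctionName": "normalize_phone",
        "DatabaseName": "sales",
        "ClassName": "com.example.udf.NormalizePhone",
        "OwnerType": "USER",
        "ResourceUris": [
            {"ResourceType": "JAR", "Uri": "s3://example-bucket/udfs/phone.jar"},
            {"ResourceType": "FILE", "Uri": "s3://example-bucket/udfs/prefixes.txt"},
        ],
    }
}
jars = jar_uris(sample_response)
```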


batch_stop_job_run
******************

Glue.Client.batch_stop_job_run(**kwargs)

   Stops one or more job runs for a specified job definition.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.batch_stop_job_run(
          JobName='string',
          JobRunIds=[
              'string',
          ]
      )

   Parameters:
      * **JobName** (*string*) --

        **[REQUIRED]**

        The name of the job definition for which to stop job runs.

      * **JobRunIds** (*list*) --

        **[REQUIRED]**

        A list of the "JobRunIds" that should be stopped for that job
        definition.

        * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'SuccessfulSubmissions': [
                 {
                     'JobName': 'string',
                     'JobRunId': 'string'
                 },
             ],
             'Errors': [
                 {
                     'JobName': 'string',
                     'JobRunId': 'string',
                     'ErrorDetail': {
                         'ErrorCode': 'string',
                         'ErrorMessage': 'string'
                     }
                 },
             ]
         }

      **Response Structure**

      * *(dict) --*

        * **SuccessfulSubmissions** *(list) --*

          A list of the JobRuns that were successfully submitted for
          stopping.

          * *(dict) --*

            Records a successful request to stop a specified "JobRun".

            * **JobName** *(string) --*

              The name of the job definition used in the job run that
              was stopped.

            * **JobRunId** *(string) --*

              The "JobRunId" of the job run that was stopped.

        * **Errors** *(list) --*

          A list of the errors that were encountered in trying to stop
          "JobRuns", including the "JobRunId" for which each error was
          encountered and details about the error.

          * *(dict) --*

            Records an error that occurred when attempting to stop a
            specified job run.

            * **JobName** *(string) --*

              The name of the job definition that is used in the job
              run in question.

            * **JobRunId** *(string) --*

              The "JobRunId" of the job run in question.

            * **ErrorDetail** *(dict) --*

              Specifies details about the error that was encountered.

              * **ErrorCode** *(string) --*

                The code associated with this error.

              * **ErrorMessage** *(string) --*

                A message describing the error.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
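
   A batch call can partly succeed, so both lists in the response
   should be inspected. The following sketch separates the two
   outcomes. The function name "stop_job_runs" is an assumption of this
   example, and "client" is expected to be a Glue client.

```python
def stop_job_runs(client, job_name, run_ids):
    """Request a stop for several runs and split the outcome.

    Returns (submitted, failed): a list of run IDs accepted for stopping
    and a dict mapping each rejected run ID to its error message.
    """
    response = client.batch_stop_job_run(JobName=job_name, JobRunIds=list(run_ids))
    submitted = [item["JobRunId"] for item in response.get("SuccessfulSubmissions", [])]
    failed = {
        item["JobRunId"]: item.get("ErrorDetail", {}).get("ErrorMessage", "")
        for item in response.get("Errors", [])
    }
    return submitted, failed
```

   A non-empty "failed" dict does not raise an exception by itself, so
   callers who need all runs stopped have to check it explicitly.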


delete_table_optimizer
**********************

Glue.Client.delete_table_optimizer(**kwargs)

   Deletes an optimizer and all associated metadata for a table. The
   optimization will no longer be performed on the table.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_table_optimizer(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          Type='compaction'|'retention'|'orphan_file_deletion'
      )

   Parameters:
      * **CatalogId** (*string*) --

        **[REQUIRED]**

        The Catalog ID of the table.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the database in the catalog in which the table
        resides.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table.

      * **Type** (*string*) --

        **[REQUIRED]**

        The type of table optimizer.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.ThrottlingException"
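
   Each call removes one optimizer type, so clearing a table takes up
   to three calls. The sketch below loops over the three documented
   "Type" values and skips types that are not configured. The function
   name "delete_all_optimizers" is hypothetical, and treating
   "EntityNotFoundException" as "nothing to delete" is a design choice
   of this example rather than documented guidance.

```python
OPTIMIZER_TYPES = ("compaction", "retention", "orphan_file_deletion")


def delete_all_optimizers(client, catalog_id, database_name, table_name):
    """Delete every optimizer configured on a table; return the types removed."""
    removed = []
    for optimizer_type in OPTIMIZER_TYPES:
        try:
            client.delete_table_optimizer(
                CatalogId=catalog_id,
                DatabaseName=database_name,
                TableName=table_name,
                Type=optimizer_type,
            )
        except client.exceptions.EntityNotFoundException:
            # No optimizer of this type exists for the table; move on.
            continue
        removed.append(optimizer_type)
    return removed
```

   Other exceptions, such as "AccessDeniedException" or
   "ThrottlingException", are left to propagate to the caller.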


list_data_quality_statistic_annotations
***************************************

Glue.Client.list_data_quality_statistic_annotations(**kwargs)

   Retrieves annotations for a data quality statistic.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.list_data_quality_statistic_annotations(
          StatisticId='string',
          ProfileId='string',
          TimestampFilter={
              'RecordedBefore': datetime(2015, 1, 1),
              'RecordedAfter': datetime(2015, 1, 1)
          },
          MaxResults=123,
          NextToken='string'
      )

   Parameters:
      * **StatisticId** (*string*) -- The Statistic ID.

      * **ProfileId** (*string*) -- The Profile ID.

      * **TimestampFilter** (*dict*) --

        A timestamp filter.

        * **RecordedBefore** *(datetime) --*

          The timestamp before which statistics should be included in
          the results.

        * **RecordedAfter** *(datetime) --*

          The timestamp after which statistics should be included in
          the results.

      * **MaxResults** (*integer*) -- The maximum number of results to
        return in this request.

      * **NextToken** (*string*) -- A pagination token to retrieve the
        next set of results.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Annotations': [
                 {
                     'ProfileId': 'string',
                     'StatisticId': 'string',
                     'StatisticRecordedOn': datetime(2015, 1, 1),
                     'InclusionAnnotation': {
                         'Value': 'INCLUDE'|'EXCLUDE',
                         'LastModifiedOn': datetime(2015, 1, 1)
                     }
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Annotations** *(list) --*

          A list of "StatisticAnnotation" objects applied to the statistic.

          * *(dict) --*

            A Statistic Annotation.

            * **ProfileId** *(string) --*

              The Profile ID.

            * **StatisticId** *(string) --*

              The Statistic ID.

            * **StatisticRecordedOn** *(datetime) --*

              The timestamp when the annotated statistic was recorded.

            * **InclusionAnnotation** *(dict) --*

              The inclusion annotation applied to the statistic.

              * **Value** *(string) --*

                The inclusion annotation value.

              * **LastModifiedOn** *(datetime) --*

                The timestamp when the inclusion annotation was last
                modified.

        * **NextToken** *(string) --*

          A pagination token to retrieve the next set of results.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"
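
   The sketch below combines the "TimestampFilter" with pagination to
   find the recording times of statistics that were marked "EXCLUDE"
   within a time window. The function name "excluded_statistic_times"
   and the example dates are assumptions made for illustration, and
   "client" is expected to be a Glue client.

```python
from datetime import datetime, timezone


def excluded_statistic_times(client, statistic_id, recorded_after, recorded_before):
    """Return StatisticRecordedOn for each EXCLUDE annotation in the window."""
    kwargs = {
        "StatisticId": statistic_id,
        "TimestampFilter": {
            "RecordedAfter": recorded_after,
            "RecordedBefore": recorded_before,
        },
    }
    excluded = []
    while True:
        response = client.list_data_quality_statistic_annotations(**kwargs)
        for annotation in response.get("Annotations", []):
            inclusion = annotation.get("InclusionAnnotation", {})
            if inclusion.get("Value") == "EXCLUDE":
                excluded.append(annotation["StatisticRecordedOn"])
        token = response.get("NextToken")
        if not token:
            return excluded
        # More pages remain; continue from the returned token.
        kwargs["NextToken"] = token


# Example window (hypothetical dates), expressed as timezone-aware datetimes.
window_start = datetime(2024, 1, 1, tzinfo=timezone.utc)
window_end = datetime(2024, 2, 1, tzinfo=timezone.utc)
# With a real client:
# times = excluded_statistic_times(client, "statistic-id", window_start, window_end)
```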


delete_column_statistics_task_settings
**************************************

Glue.Client.delete_column_statistics_task_settings(**kwargs)

   Deletes settings for a column statistics task.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_column_statistics_task_settings(
          DatabaseName='string',
          TableName='string'
      )

   Parameters:
      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the database where the table resides.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table for which to delete column statistics.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"


get_jobs
********

Glue.Client.get_jobs(**kwargs)

   Retrieves all current job definitions.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_jobs(
          NextToken='string',
          MaxResults=123
      )

   Parameters:
      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation call.

      * **MaxResults** (*integer*) -- The maximum size of the
        response.

   Return type:
      dict

   Returns:
      **Response Syntax**

         # This section is too large to render.
         # Please see the AWS API Documentation linked below.

      AWS API Documentation

      **Response Structure**

         # This section is too large to render.
         # Please see the AWS API Documentation linked below.

      AWS API Documentation

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
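
   Since the response structure is not rendered here, the sketch below
   relies on two facts from the linked AWS API documentation: the
   response carries a "Jobs" list, and each entry has a "Name" key. It
   uses the client's built-in paginator for "get_jobs" instead of
   handling "NextToken" by hand. The function name "job_names" is
   hypothetical.

```python
def job_names(client):
    """Collect the names of all job definitions, sorted alphabetically."""
    paginator = client.get_paginator("get_jobs")
    names = []
    for page in paginator.paginate():
        # Each page is one get_jobs response; 'Jobs' holds the definitions.
        names.extend(job["Name"] for job in page.get("Jobs", []))
    return sorted(names)
```

   The paginator issues as many "get_jobs" calls as needed, so the
   caller never sees the continuation token.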


update_ml_transform
*******************

Glue.Client.update_ml_transform(**kwargs)

   Updates an existing machine learning transform. Call this operation
   to tune the algorithm parameters to achieve better results.

   After calling this operation, you can call the
   "StartMLEvaluationTaskRun" operation to assess how well your new
   parameters achieved your goals (such as improving the quality of
   your machine learning transform, or making it more cost-effective).

   See also: AWS API Documentation

   **Request Syntax**

      response = client.update_ml_transform(
          TransformId='string',
          Name='string',
          Description='string',
          Parameters={
              'TransformType': 'FIND_MATCHES',
              'FindMatchesParameters': {
                  'PrimaryKeyColumnName': 'string',
                  'PrecisionRecallTradeoff': 123.0,
                  'AccuracyCostTradeoff': 123.0,
                  'EnforceProvidedLabels': True|False
              }
          },
          Role='string',
          GlueVersion='string',
          MaxCapacity=123.0,
          WorkerType='Standard'|'G.1X'|'G.2X'|'G.025X'|'G.4X'|'G.8X'|'Z.2X',
          NumberOfWorkers=123,
          Timeout=123,
          MaxRetries=123
      )

   Parameters:
      * **TransformId** (*string*) --

        **[REQUIRED]**

        A unique identifier that was generated when the transform was
        created.

      * **Name** (*string*) -- The unique name that you gave the
        transform when you created it.

      * **Description** (*string*) -- A description of the transform.
        The default is an empty string.

      * **Parameters** (*dict*) --

        The configuration parameters that are specific to the
        transform type (algorithm) used. Conditionally dependent on
        the transform type.

        * **TransformType** *(string) --* **[REQUIRED]**

          The type of machine learning transform.

          For information about the types of machine learning
          transforms, see Creating Machine Learning Transforms.

        * **FindMatchesParameters** *(dict) --*

          The parameters for the find matches algorithm.

          * **PrimaryKeyColumnName** *(string) --*

            The name of a column that uniquely identifies rows in the
            source table. Used to help identify matching records.

          * **PrecisionRecallTradeoff** *(float) --*

            The value selected when tuning your transform for a
            balance between precision and recall. A value of 0.5 means
            no preference; a value of 1.0 means a bias purely for
            precision, and a value of 0.0 means a bias for recall.
            Because this is a tradeoff, choosing values close to 1.0
            means very low recall, and choosing values close to 0.0
            results in very low precision.

            The precision metric indicates how often your model is
            correct when it predicts a match.

            The recall metric indicates how often your model predicts
            a match when there is an actual match.

          * **AccuracyCostTradeoff** *(float) --*

            The value that is selected when tuning your transform for
            a balance between accuracy and cost. A value of 0.5 means
            that the system balances accuracy and cost concerns. A
            value of 1.0 means a bias purely for accuracy, which
            typically results in a higher cost, sometimes
            substantially higher. A value of 0.0 means a bias purely
            for cost, which results in a less accurate "FindMatches"
            transform, sometimes with unacceptable accuracy.

            Accuracy measures how well the transform finds true
            positives and true negatives. Increasing accuracy requires
            more machine resources and cost. But it also results in
            increased recall.

            Cost measures how many compute resources, and thus money,
            are consumed to run the transform.

          * **EnforceProvidedLabels** *(boolean) --*

            The value to switch on or off to force the output to match
            the provided labels from users. If the value is "True",
            the "find matches" transform forces the output to match
            the provided labels. The results override the normal
            conflation results. If the value is "False", the "find
            matches" transform does not ensure all the labels provided
            are respected, and the results rely on the trained model.

            Note that setting this value to true may increase the
            conflation execution time.

      * **Role** (*string*) -- The name or Amazon Resource Name (ARN)
        of the IAM role with the required permissions.

      * **GlueVersion** (*string*) -- This value determines which
        version of Glue this machine learning transform is compatible
        with. Glue 1.0 is recommended for most customers. If the value
        is not set, the Glue compatibility defaults to Glue 0.9. For
        more information, see Glue Versions in the developer guide.

      * **MaxCapacity** (*float*) --

        The number of Glue data processing units (DPUs) that are
        allocated to task runs for this transform. You can allocate
        from 2 to 100 DPUs; the default is 10. A DPU is a relative
        measure of processing power that consists of 4 vCPUs of
        compute capacity and 16 GB of memory. For more information,
        see the Glue pricing page.

        When the "WorkerType" field is set to a value other than
        "Standard", the "MaxCapacity" field is set automatically and
        becomes read-only.

      * **WorkerType** (*string*) --

        The type of predefined worker that is allocated when this task
        runs. Accepts a value of Standard, G.1X, or G.2X.

        * For the "Standard" worker type, each worker provides 4 vCPU,
          16 GB of memory and a 50 GB disk, and 2 executors per worker.

        * For the "G.1X" worker type, each worker provides 4 vCPU, 16
          GB of memory and a 64 GB disk, and 1 executor per worker.

        * For the "G.2X" worker type, each worker provides 8 vCPU, 32
          GB of memory and a 128 GB disk, and 1 executor per worker.

      * **NumberOfWorkers** (*integer*) -- The number of workers of a
        defined "WorkerType" that are allocated when this task runs.

      * **Timeout** (*integer*) -- The timeout for a task run for this
        transform in minutes. This is the maximum time that a task run
        for this transform can consume resources before it is
        terminated and enters "TIMEOUT" status. The default is 2,880
        minutes (48 hours).

      * **MaxRetries** (*integer*) -- The maximum number of times to
        retry a task for this transform after a task run fails.
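
   The constraints described above (tradeoff values on a 0.0 to 1.0
   scale, and "MaxCapacity" becoming read-only for non-"Standard"
   worker types) can be checked before the call is made. The following
   is a minimal sketch, not part of the SDK: the helper name
   "build_update_ml_transform_kwargs" and its argument names are
   hypothetical, and the validation is one reading of the prose above.

```python
def build_update_ml_transform_kwargs(
    transform_id,
    precision_recall=None,
    accuracy_cost=None,
    worker_type=None,
    number_of_workers=None,
    max_capacity=None,
):
    """Assemble keyword arguments for client.update_ml_transform().

    Hypothetical helper: the parameter names come from the reference,
    the validation is an illustrative reading of its prose.
    """
    # Both tradeoffs are described on a 0.0 to 1.0 scale.
    for label, value in (
        ("PrecisionRecallTradeoff", precision_recall),
        ("AccuracyCostTradeoff", accuracy_cost),
    ):
        if value is not None and not 0.0 <= value <= 1.0:
            raise ValueError(f"{label} must be between 0.0 and 1.0")

    # MaxCapacity is read-only once WorkerType is not "Standard".
    if max_capacity is not None and worker_type not in (None, "Standard"):
        raise ValueError("MaxCapacity cannot be combined with " + worker_type)

    kwargs = {"TransformId": transform_id}
    find_matches = {}
    if precision_recall is not None:
        find_matches["PrecisionRecallTradeoff"] = precision_recall
    if accuracy_cost is not None:
        find_matches["AccuracyCostTradeoff"] = accuracy_cost
    if find_matches:
        kwargs["Parameters"] = {
            "TransformType": "FIND_MATCHES",
            "FindMatchesParameters": find_matches,
        }
    optional = {
        "WorkerType": worker_type,
        "NumberOfWorkers": number_of_workers,
        "MaxCapacity": max_capacity,
    }
    kwargs.update({k: v for k, v in optional.items() if v is not None})
    return kwargs
```

   The result can be passed as "client.update_ml_transform(**kwargs)".
   Whether fields omitted from "FindMatchesParameters" keep their
   previous values is not stated in this reference, so it may be safer
   to send every field you rely on.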

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'TransformId': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **TransformId** *(string) --*

          The unique identifier for the transform that was updated.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.AccessDeniedException"


get_integration_resource_property
*********************************

Glue.Client.get_integration_resource_property(**kwargs)

   Fetches the "ResourceProperty" of the Glue connection (for the
   source) or the Glue database ARN (for the target).

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_integration_resource_property(
          ResourceArn='string'
      )

   Parameters:
      **ResourceArn** (*string*) --

      **[REQUIRED]**

      The connection ARN of the source, or the database ARN of the
      target.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'ResourceArn': 'string',
             'SourceProcessingProperties': {
                 'RoleArn': 'string'
             },
             'TargetProcessingProperties': {
                 'RoleArn': 'string',
                 'KmsArn': 'string',
                 'ConnectionName': 'string',
                 'EventBusArn': 'string'
             }
         }

      **Response Structure**

      * *(dict) --*

        * **ResourceArn** *(string) --*

          The connection ARN of the source, or the database ARN of the
          target.

        * **SourceProcessingProperties** *(dict) --*

          The resource properties associated with the integration
          source.

          * **RoleArn** *(string) --*

            The IAM role to access the Glue connection.

        * **TargetProcessingProperties** *(dict) --*

          The resource properties associated with the integration
          target.

          * **RoleArn** *(string) --*

            The IAM role to access the Glue database.

          * **KmsArn** *(string) --*

            The ARN of the KMS key used for encryption.

          * **ConnectionName** *(string) --*

            The Glue network connection to configure the Glue job
            running in the customer VPC.

          * **EventBusArn** *(string) --*

            The ARN of an EventBridge event bus to receive the
            integration status notification.

   **Exceptions**

   * "Glue.Client.exceptions.ValidationException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.InternalServerException"

   * "Glue.Client.exceptions.ResourceNotFoundException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.InvalidInputException"
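
   Because the source and target property blocks are nested and may be
   absent, it can help to flatten the response before using it. A
   minimal sketch follows; the helper name
   "summarize_resource_property" and the flattened key names are
   hypothetical, and only the response keys come from the structure
   documented above.

```python
def summarize_resource_property(response):
    """Flatten a get_integration_resource_property response dict.

    Hypothetical helper: missing blocks or fields become None rather
    than raising KeyError.
    """
    source = response.get("SourceProcessingProperties") or {}
    target = response.get("TargetProcessingProperties") or {}
    return {
        "resource_arn": response.get("ResourceArn"),
        "source_role_arn": source.get("RoleArn"),
        "target_role_arn": target.get("RoleArn"),
        "kms_arn": target.get("KmsArn"),
        "connection_name": target.get("ConnectionName"),
        "event_bus_arn": target.get("EventBusArn"),
    }
```

   Typical use would be "summarize_resource_property(
   client.get_integration_resource_property(ResourceArn=arn))", where
   "arn" is a placeholder for your own resource ARN.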


batch_create_partition
**********************

Glue.Client.batch_create_partition(**kwargs)

   Creates one or more partitions in a batch operation.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.batch_create_partition(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          PartitionInputList=[
              {
                  'Values': [
                      'string',
                  ],
                  'LastAccessTime': datetime(2015, 1, 1),
                  'StorageDescriptor': {
                      'Columns': [
                          {
                              'Name': 'string',
                              'Type': 'string',
                              'Comment': 'string',
                              'Parameters': {
                                  'string': 'string'
                              }
                          },
                      ],
                      'Location': 'string',
                      'AdditionalLocations': [
                          'string',
                      ],
                      'InputFormat': 'string',
                      'OutputFormat': 'string',
                      'Compressed': True|False,
                      'NumberOfBuckets': 123,
                      'SerdeInfo': {
                          'Name': 'string',
                          'SerializationLibrary': 'string',
                          'Parameters': {
                              'string': 'string'
                          }
                      },
                      'BucketColumns': [
                          'string',
                      ],
                      'SortColumns': [
                          {
                              'Column': 'string',
                              'SortOrder': 123
                          },
                      ],
                      'Parameters': {
                          'string': 'string'
                      },
                      'SkewedInfo': {
                          'SkewedColumnNames': [
                              'string',
                          ],
                          'SkewedColumnValues': [
                              'string',
                          ],
                          'SkewedColumnValueLocationMaps': {
                              'string': 'string'
                          }
                      },
                      'StoredAsSubDirectories': True|False,
                      'SchemaReference': {
                          'SchemaId': {
                              'SchemaArn': 'string',
                              'SchemaName': 'string',
                              'RegistryName': 'string'
                          },
                          'SchemaVersionId': 'string',
                          'SchemaVersionNumber': 123
                      }
                  },
                  'Parameters': {
                      'string': 'string'
                  },
                  'LastAnalyzedTime': datetime(2015, 1, 1)
              },
          ]
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the catalog in which the
        partition is to be created. Currently, this should be the
        Amazon Web Services account ID.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the metadata database in which the partition is to
        be created.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the metadata table in which the partition is to be
        created.

      * **PartitionInputList** (*list*) --

        **[REQUIRED]**

        A list of "PartitionInput" structures that define the
        partitions to be created.

        * *(dict) --*

          The structure used to create and update a partition.

          * **Values** *(list) --*

            The values of the partition. Although this parameter is
            not required by the SDK, you must specify this parameter
            for a valid input.

            The values for the keys for the new partition must be
            passed as an array of String objects that must be ordered
            in the same order as the partition keys appearing in the
            Amazon S3 prefix. Otherwise Glue will add the values to
            the wrong keys.

            * *(string) --*

          * **LastAccessTime** *(datetime) --*

            The last time at which the partition was accessed.

          * **StorageDescriptor** *(dict) --*

            Provides information about the physical location where the
            partition is stored.

            * **Columns** *(list) --*

              A list of the "Columns" in the table.

              * *(dict) --*

                A column in a "Table".

                * **Name** *(string) --* **[REQUIRED]**

                  The name of the "Column".

                * **Type** *(string) --*

                  The data type of the "Column".

                * **Comment** *(string) --*

                  A free-form text comment.

                * **Parameters** *(dict) --*

                  These key-value pairs define properties associated
                  with the column.

                  * *(string) --*

                    * *(string) --*

            * **Location** *(string) --*

              The physical location of the table. By default, this
              takes the form of the warehouse location, followed by
              the database location in the warehouse, followed by the
              table name.

            * **AdditionalLocations** *(list) --*

              A list of locations that point to the path where a Delta
              table is located.

              * *(string) --*

            * **InputFormat** *(string) --*

              The input format: "SequenceFileInputFormat" (binary), or
              "TextInputFormat", or a custom format.

            * **OutputFormat** *(string) --*

              The output format: "SequenceFileOutputFormat" (binary),
              or "IgnoreKeyTextOutputFormat", or a custom format.

            * **Compressed** *(boolean) --*

              "True" if the data in the table is compressed, or
              "False" if not.

            * **NumberOfBuckets** *(integer) --*

              Must be specified if the table contains any dimension
              columns.

            * **SerdeInfo** *(dict) --*

              The serialization/deserialization (SerDe) information.

              * **Name** *(string) --*

                Name of the SerDe.

              * **SerializationLibrary** *(string) --*

                Usually the class that implements the SerDe. An
                example is
                "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe".

              * **Parameters** *(dict) --*

                These key-value pairs define initialization parameters
                for the SerDe.

                * *(string) --*

                  * *(string) --*

            * **BucketColumns** *(list) --*

              A list of reducer grouping columns, clustering columns,
              and bucketing columns in the table.

              * *(string) --*

            * **SortColumns** *(list) --*

              A list specifying the sort order of each bucket in the
              table.

              * *(dict) --*

                Specifies the sort order of a sorted column.

                * **Column** *(string) --* **[REQUIRED]**

                  The name of the column.

                * **SortOrder** *(integer) --* **[REQUIRED]**

                  Indicates that the column is sorted in ascending
                  order ("== 1"), or in descending order ("== 0").

            * **Parameters** *(dict) --*

              The user-supplied properties in key-value form.

              * *(string) --*

                * *(string) --*

            * **SkewedInfo** *(dict) --*

              The information about values that appear frequently in a
              column (skewed values).

              * **SkewedColumnNames** *(list) --*

                A list of names of columns that contain skewed values.

                * *(string) --*

              * **SkewedColumnValues** *(list) --*

                A list of values that appear so frequently as to be
                considered skewed.

                * *(string) --*

              * **SkewedColumnValueLocationMaps** *(dict) --*

                A mapping of skewed values to the columns that contain
                them.

                * *(string) --*

                  * *(string) --*

            * **StoredAsSubDirectories** *(boolean) --*

              "True" if the table data is stored in subdirectories, or
              "False" if not.

            * **SchemaReference** *(dict) --*

              An object that references a schema stored in the Glue
              Schema Registry.

              When creating a table, you can pass an empty list of
              columns for the schema, and instead use a schema
              reference.

              * **SchemaId** *(dict) --*

                A structure that contains schema identity fields.
                Either this or the "SchemaVersionId" has to be
                provided.

                * **SchemaArn** *(string) --*

                  The Amazon Resource Name (ARN) of the schema. One of
                  "SchemaArn" or "SchemaName" has to be provided.

                * **SchemaName** *(string) --*

                  The name of the schema. One of "SchemaArn" or
                  "SchemaName" has to be provided.

                * **RegistryName** *(string) --*

                  The name of the schema registry that contains the
                  schema.

              * **SchemaVersionId** *(string) --*

                The unique ID assigned to a version of the schema.
                Either this or the "SchemaId" has to be provided.

              * **SchemaVersionNumber** *(integer) --*

                The version number of the schema.

          * **Parameters** *(dict) --*

            These key-value pairs define partition parameters.

            * *(string) --*

              * *(string) --*

          * **LastAnalyzedTime** *(datetime) --*

            The last time at which column statistics were computed for
            this partition.
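
   The ordering requirement on "Values" is easy to get wrong when
   partition values are held in a mapping. The sketch below builds one
   "PartitionInput" with the values arranged in partition-key order.
   The helper name "build_partition_input" and its arguments are
   hypothetical. Passing only "Location" in "StorageDescriptor" is a
   simplification; in practice you would likely copy the remaining
   fields from the table's own storage descriptor.

```python
def build_partition_input(partition_keys, key_values, location=None):
    """Build one PartitionInput dict with Values in partition-key order.

    partition_keys: key names in the order they appear in the S3 prefix.
    key_values: mapping of key name to value (any order).
    """
    missing = [key for key in partition_keys if key not in key_values]
    if missing:
        raise KeyError(f"missing partition values for: {missing}")

    # Values must be strings, ordered like the partition keys.
    partition_input = {
        "Values": [str(key_values[key]) for key in partition_keys]
    }
    if location is not None:
        partition_input["StorageDescriptor"] = {"Location": location}
    return partition_input
```

   For example, "build_partition_input(['year', 'month'], {'month': 3,
   'year': 2024})" yields "Values" of "['2024', '3']" regardless of the
   order of the mapping.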

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Errors': [
                 {
                     'PartitionValues': [
                         'string',
                     ],
                     'ErrorDetail': {
                         'ErrorCode': 'string',
                         'ErrorMessage': 'string'
                     }
                 },
             ]
         }

      **Response Structure**

      * *(dict) --*

        * **Errors** *(list) --*

          The errors encountered when trying to create the requested
          partitions.

          * *(dict) --*

            Contains information about a partition error.

            * **PartitionValues** *(list) --*

              The values that define the partition.

              * *(string) --*

            * **ErrorDetail** *(dict) --*

              The details about the partition error.

              * **ErrorCode** *(string) --*

                The code associated with this error.

              * **ErrorMessage** *(string) --*

                A message describing the error.
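
   Per-partition failures are reported in "Errors", so the response
   should be inspected after each call. The sketch below sends
   partitions in chunks and collects every reported error. The
   function name and the default "batch_size" of 100 are assumptions
   made for illustration, since this reference does not state a
   per-call limit. "client" is any object exposing
   "batch_create_partition", such as "boto3.client('glue')".

```python
def create_partitions_in_batches(client, database, table,
                                 partition_inputs, batch_size=100):
    """Call batch_create_partition in chunks and return all Errors.

    batch_size is an assumed default, not a documented limit here.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    errors = []
    for start in range(0, len(partition_inputs), batch_size):
        chunk = partition_inputs[start:start + batch_size]
        response = client.batch_create_partition(
            DatabaseName=database,
            TableName=table,
            PartitionInputList=chunk,
        )
        # Each entry carries PartitionValues and an ErrorDetail dict.
        errors.extend(response.get("Errors", []))
    return errors
```

   Each returned entry can then be logged or retried using its
   "PartitionValues" and "ErrorDetail" fields.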

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"


get_partition_indexes
*********************

Glue.Client.get_partition_indexes(**kwargs)

   Retrieves the partition indexes associated with a table.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_partition_indexes(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          NextToken='string'
      )

   Parameters:
      * **CatalogId** (*string*) -- The catalog ID where the table
        resides.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        Specifies the name of a database from which you want to
        retrieve partition indexes.

      * **TableName** (*string*) --

        **[REQUIRED]**

        Specifies the name of a table for which you want to retrieve
        the partition indexes.

      * **NextToken** (*string*) -- A continuation token, included if
        this is a continuation call.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'PartitionIndexDescriptorList': [
                 {
                     'IndexName': 'string',
                     'Keys': [
                         {
                             'Name': 'string',
                             'Type': 'string'
                         },
                     ],
                     'IndexStatus': 'CREATING'|'ACTIVE'|'DELETING'|'FAILED',
                     'BackfillErrors': [
                         {
                             'Code': 'ENCRYPTED_PARTITION_ERROR'|'INTERNAL_ERROR'|'INVALID_PARTITION_TYPE_DATA_ERROR'|'MISSING_PARTITION_VALUE_ERROR'|'UNSUPPORTED_PARTITION_CHARACTER_ERROR',
                             'Partitions': [
                                 {
                                     'Values': [
                                         'string',
                                     ]
                                 },
                             ]
                         },
                     ]
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **PartitionIndexDescriptorList** *(list) --*

          A list of index descriptors.

          * *(dict) --*

            A descriptor for a partition index in a table.

            * **IndexName** *(string) --*

              The name of the partition index.

            * **Keys** *(list) --*

              A list of one or more keys, as "KeySchemaElement"
              structures, for the partition index.

              * *(dict) --*

                A partition key pair consisting of a name and a type.

                * **Name** *(string) --*

                  The name of a partition key.

                * **Type** *(string) --*

                  The type of a partition key.

            * **IndexStatus** *(string) --*

              The status of the partition index.

              The possible statuses are:

              * CREATING: The index is being created. When an index is
                in a CREATING state, the index or its table cannot be
                deleted.

              * ACTIVE: The index creation succeeds.

              * FAILED: The index creation fails.

              * DELETING: The index is deleted from the list of
                indexes.

            * **BackfillErrors** *(list) --*

              A list of errors that can occur when registering
              partition indexes for an existing table.

              * *(dict) --*

                A list of errors that can occur when registering
                partition indexes for an existing table.

                These errors give the details about why an index
                registration failed and provide a limited number of
                partitions in the response, so that you can fix the
                partitions at fault and try registering the index
                again. The most common set of errors that can occur
                are categorized as follows:

                * EncryptedPartitionError: The partitions are
                  encrypted.

                * InvalidPartitionTypeDataError: The partition value
                  doesn't match the data type for that partition
                  column.

                * MissingPartitionValueError: A partition value is
                  missing.

                * UnsupportedPartitionCharacterError: Characters
                  inside the partition value are not supported. For
                  example: U+0000 , U+0001, U+0002.

                * InternalError: Any error which does not belong to
                  other error codes.

                * **Code** *(string) --*

                  The error code for an error that occurred when
                  registering partition indexes for an existing table.

                * **Partitions** *(list) --*

                  A list of a limited number of partitions in the
                  response.

                  * *(dict) --*

                    Contains a list of values defining partitions.

                    * **Values** *(list) --*

                      The list of values.

                      * *(string) --*

        * **NextToken** *(string) --*

          A continuation token, present if the current list segment is
          not the last.
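
   To find the partitions that need fixing, the "BackfillErrors"
   entries can be grouped by index and error code. This is a minimal
   sketch: the helper name "collect_backfill_errors" is hypothetical,
   and it operates only on the "PartitionIndexDescriptorList"
   structure shown above.

```python
def collect_backfill_errors(descriptors):
    """Group backfill errors as {index name: {error code: [values, ...]}}.

    Indexes that report no BackfillErrors are left out of the result.
    """
    report = {}
    for descriptor in descriptors:
        by_code = {}
        for error in descriptor.get("BackfillErrors", []):
            # Each partition entry holds one list of partition values.
            values = [p.get("Values", []) for p in error.get("Partitions", [])]
            by_code.setdefault(error.get("Code"), []).extend(values)
        if by_code:
            report[descriptor.get("IndexName")] = by_code
    return report
```

   It would typically be called with
   "response['PartitionIndexDescriptorList']".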

   **Exceptions**

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.ConflictException"
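
   Because results may span several pages, a caller usually loops
   until "NextToken" is no longer returned. The generator below is a
   minimal sketch of that loop. The function name
   "iter_partition_indexes" is hypothetical, and "client" is any
   object exposing "get_partition_indexes", such as
   "boto3.client('glue')".

```python
def iter_partition_indexes(client, database, table, catalog_id=None):
    """Yield every partition index descriptor, following NextToken."""
    kwargs = {"DatabaseName": database, "TableName": table}
    if catalog_id is not None:
        kwargs["CatalogId"] = catalog_id

    while True:
        response = client.get_partition_indexes(**kwargs)
        yield from response.get("PartitionIndexDescriptorList", [])

        # A missing or empty token means this was the last page.
        token = response.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token
```

   "client.can_paginate('get_partition_indexes')" reports whether a
   built-in paginator is available as an alternative to a manual loop.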


get_unfiltered_table_metadata
*****************************

Glue.Client.get_unfiltered_table_metadata(**kwargs)

   Allows a third-party analytical engine to retrieve unfiltered table
   metadata from the Data Catalog.

   For IAM authorization, the public IAM action associated with this
   API is "glue:GetTable".

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_unfiltered_table_metadata(
          Region='string',
          CatalogId='string',
          DatabaseName='string',
          Name='string',
          AuditContext={
              'AdditionalAuditContext': 'string',
              'RequestedColumns': [
                  'string',
              ],
              'AllColumnsRequested': True|False
          },
          SupportedPermissionTypes=[
              'COLUMN_PERMISSION'|'CELL_FILTER_PERMISSION'|'NESTED_PERMISSION'|'NESTED_CELL_PERMISSION',
          ],
          ParentResourceArn='string',
          RootResourceArn='string',
          SupportedDialect={
              'Dialect': 'REDSHIFT'|'ATHENA'|'SPARK',
              'DialectVersion': 'string'
          },
          Permissions=[
              'ALL'|'SELECT'|'ALTER'|'DROP'|'DELETE'|'INSERT'|'CREATE_DATABASE'|'CREATE_TABLE'|'DATA_LOCATION_ACCESS',
          ],
          QuerySessionContext={
              'QueryId': 'string',
              'QueryStartTime': datetime(2015, 1, 1),
              'ClusterId': 'string',
              'QueryAuthorizationId': 'string',
              'AdditionalContext': {
                  'string': 'string'
              }
          }
      )

   Parameters:
      * **Region** (*string*) -- Specified only if the base tables
        belong to a different Amazon Web Services Region.

      * **CatalogId** (*string*) --

        **[REQUIRED]**

        The catalog ID where the table resides.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        Specifies the name of a database that contains the table.

      * **Name** (*string*) --

        **[REQUIRED]**

        Specifies the name of a table for which you are requesting
        metadata.

      * **AuditContext** (*dict*) --

        A structure containing Lake Formation audit context
        information.

        * **AdditionalAuditContext** *(string) --*

          A string containing the additional audit context
          information.

        * **RequestedColumns** *(list) --*

          The requested columns for audit.

          * *(string) --*

        * **AllColumnsRequested** *(boolean) --*

          Indicates whether all columns were requested, for audit.

      * **SupportedPermissionTypes** (*list*) --

        **[REQUIRED]**

        Indicates the level of filtering a third-party analytical
        engine is capable of enforcing when calling the
        "GetUnfilteredTableMetadata" API operation. Accepted values
        are:

        * "COLUMN_PERMISSION" - Column permissions ensure that users
          can access only specific columns in the table. If
          particular columns contain sensitive data, data lake
          administrators can define column filters that exclude access
          to specific columns.

        * "CELL_FILTER_PERMISSION" - Cell-level filtering combines
          column filtering (include or exclude columns) and row filter
          expressions to restrict access to individual elements in the
          table.

        * "NESTED_PERMISSION" - Nested permissions combine cell-level
          filtering and nested column filtering to restrict access to
          columns and/or nested columns in specific rows based on row
          filter expressions.

        * "NESTED_CELL_PERMISSION" - Nested cell permissions combine
          nested permission with nested cell-level filtering. This
          allows different subsets of nested columns to be restricted
          based on an array of row filter expressions.

        Note: Each of these permission types follows a hierarchical
        order where each subsequent permission type includes all
        permissions of the previous type.

        Important: If you provide a supported permission type that
        doesn't match the user's level of permissions on the table,
        then Lake Formation raises an exception. For example, if the
        third-party engine calling the "GetUnfilteredTableMetadata"
        operation can enforce only column-level filtering, and the
        user has nested cell filtering applied on the table, Lake
        Formation throws an exception, and will not return unfiltered
        table metadata and data access credentials.

        * *(string) --*

      * **ParentResourceArn** (*string*) -- The resource ARN of the
        view.

      * **RootResourceArn** (*string*) -- The resource ARN of the root
        view in a chain of nested views.

      * **SupportedDialect** (*dict*) --

        A structure specifying the dialect and dialect version used by
        the query engine.

        * **Dialect** *(string) --*

          The dialect of the query engine.

        * **DialectVersion** *(string) --*

          The version of the dialect of the query engine. For example,
          3.0.0.

      * **Permissions** (*list*) --

        The Lake Formation data permissions of the caller on the
        table. Used to authorize the call when no view context is
        found.

        * *(string) --*

      * **QuerySessionContext** (*dict*) --

        A structure used as a protocol between query engines and Lake
        Formation or Glue. Contains both a Lake Formation generated
        authorization identifier and information from the request's
        authorization context.

        * **QueryId** *(string) --*

          A unique identifier generated by the query engine for the
          query.

        * **QueryStartTime** *(datetime) --*

          A timestamp provided by the query engine for when the query
          started.

        * **ClusterId** *(string) --*

          An identifier string for the consumer cluster.

        * **QueryAuthorizationId** *(string) --*

          A cryptographically generated query identifier generated by
          Glue or Lake Formation.

        * **AdditionalContext** *(dict) --*

          An opaque string-string map passed by the query engine.

          * *(string) --*

            * *(string) --*
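
   The required parameters and the accepted "SupportedPermissionTypes"
   values can be validated before the request is sent. The following
   is a minimal sketch under stated assumptions: the helper name
   "build_unfiltered_table_metadata_kwargs" is hypothetical, rejecting
   an empty permission list is a choice made here because the
   parameter is required, and the query start time is simply taken as
   the current UTC time.

```python
from datetime import datetime, timezone

# Accepted values as listed under SupportedPermissionTypes above.
ACCEPTED_PERMISSION_TYPES = (
    "COLUMN_PERMISSION",
    "CELL_FILTER_PERMISSION",
    "NESTED_PERMISSION",
    "NESTED_CELL_PERMISSION",
)


def build_unfiltered_table_metadata_kwargs(
    catalog_id, database, name, permission_types, region=None, query_id=None
):
    """Assemble kwargs for client.get_unfiltered_table_metadata()."""
    permission_types = list(permission_types)
    if not permission_types:
        raise ValueError("at least one permission type is expected")
    unknown = [p for p in permission_types
               if p not in ACCEPTED_PERMISSION_TYPES]
    if unknown:
        raise ValueError(f"unsupported permission types: {unknown}")

    kwargs = {
        "CatalogId": catalog_id,
        "DatabaseName": database,
        "Name": name,
        "SupportedPermissionTypes": permission_types,
    }
    if region is not None:
        kwargs["Region"] = region
    if query_id is not None:
        # Illustrative only: the start time is set to "now" in UTC.
        kwargs["QuerySessionContext"] = {
            "QueryId": query_id,
            "QueryStartTime": datetime.now(timezone.utc),
        }
    return kwargs
```

   The result can be passed as
   "client.get_unfiltered_table_metadata(**kwargs)". Declare only the
   permission types the engine can actually enforce, since a mismatch
   with the user's permissions raises an exception as described above.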

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Table': {
                 'Name': 'string',
                 'DatabaseName': 'string',
                 'Description': 'string',
                 'Owner': 'string',
                 'CreateTime': datetime(2015, 1, 1),
                 'UpdateTime': datetime(2015, 1, 1),
                 'LastAccessTime': datetime(2015, 1, 1),
                 'LastAnalyzedTime': datetime(2015, 1, 1),
                 'Retention': 123,
                 'StorageDescriptor': {
                     'Columns': [
                         {
                             'Name': 'string',
                             'Type': 'string',
                             'Comment': 'string',
                             'Parameters': {
                                 'string': 'string'
                             }
                         },
                     ],
                     'Location': 'string',
                     'AdditionalLocations': [
                         'string',
                     ],
                     'InputFormat': 'string',
                     'OutputFormat': 'string',
                     'Compressed': True|False,
                     'NumberOfBuckets': 123,
                     'SerdeInfo': {
                         'Name': 'string',
                         'SerializationLibrary': 'string',
                         'Parameters': {
                             'string': 'string'
                         }
                     },
                     'BucketColumns': [
                         'string',
                     ],
                     'SortColumns': [
                         {
                             'Column': 'string',
                             'SortOrder': 123
                         },
                     ],
                     'Parameters': {
                         'string': 'string'
                     },
                     'SkewedInfo': {
                         'SkewedColumnNames': [
                             'string',
                         ],
                         'SkewedColumnValues': [
                             'string',
                         ],
                         'SkewedColumnValueLocationMaps': {
                             'string': 'string'
                         }
                     },
                     'StoredAsSubDirectories': True|False,
                     'SchemaReference': {
                         'SchemaId': {
                             'SchemaArn': 'string',
                             'SchemaName': 'string',
                             'RegistryName': 'string'
                         },
                         'SchemaVersionId': 'string',
                         'SchemaVersionNumber': 123
                     }
                 },
                 'PartitionKeys': [
                     {
                         'Name': 'string',
                         'Type': 'string',
                         'Comment': 'string',
                         'Parameters': {
                             'string': 'string'
                         }
                     },
                 ],
                 'ViewOriginalText': 'string',
                 'ViewExpandedText': 'string',
                 'TableType': 'string',
                 'Parameters': {
                     'string': 'string'
                 },
                 'CreatedBy': 'string',
                 'IsRegisteredWithLakeFormation': True|False,
                 'TargetTable': {
                     'CatalogId': 'string',
                     'DatabaseName': 'string',
                     'Name': 'string',
                     'Region': 'string'
                 },
                 'CatalogId': 'string',
                 'VersionId': 'string',
                 'FederatedTable': {
                     'Identifier': 'string',
                     'DatabaseIdentifier': 'string',
                     'ConnectionName': 'string',
                     'ConnectionType': 'string'
                 },
                 'ViewDefinition': {
                     'IsProtected': True|False,
                     'Definer': 'string',
                     'SubObjects': [
                         'string',
                     ],
                     'Representations': [
                         {
                             'Dialect': 'REDSHIFT'|'ATHENA'|'SPARK',
                             'DialectVersion': 'string',
                             'ViewOriginalText': 'string',
                             'ViewExpandedText': 'string',
                             'ValidationConnection': 'string',
                             'IsStale': True|False
                         },
                     ]
                 },
                 'IsMultiDialectView': True|False,
                 'Status': {
                     'RequestedBy': 'string',
                     'UpdatedBy': 'string',
                     'RequestTime': datetime(2015, 1, 1),
                     'UpdateTime': datetime(2015, 1, 1),
                     'Action': 'UPDATE'|'CREATE',
                     'State': 'QUEUED'|'IN_PROGRESS'|'SUCCESS'|'STOPPED'|'FAILED',
                     'Error': {
                         'ErrorCode': 'string',
                         'ErrorMessage': 'string'
                     },
                     'Details': {
                         'RequestedChange': {'... recursive ...'},
                         'ViewValidations': [
                             {
                                 'Dialect': 'REDSHIFT'|'ATHENA'|'SPARK',
                                 'DialectVersion': 'string',
                                 'ViewValidationText': 'string',
                                 'UpdateTime': datetime(2015, 1, 1),
                                 'State': 'QUEUED'|'IN_PROGRESS'|'SUCCESS'|'STOPPED'|'FAILED',
                                 'Error': {
                                     'ErrorCode': 'string',
                                     'ErrorMessage': 'string'
                                 }
                             },
                         ]
                     }
                 }
             },
             'AuthorizedColumns': [
                 'string',
             ],
             'IsRegisteredWithLakeFormation': True|False,
             'CellFilters': [
                 {
                     'ColumnName': 'string',
                     'RowFilterExpression': 'string'
                 },
             ],
             'QueryAuthorizationId': 'string',
             'IsMultiDialectView': True|False,
             'ResourceArn': 'string',
             'IsProtected': True|False,
             'Permissions': [
                 'ALL'|'SELECT'|'ALTER'|'DROP'|'DELETE'|'INSERT'|'CREATE_DATABASE'|'CREATE_TABLE'|'DATA_LOCATION_ACCESS',
             ],
             'RowFilter': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Table** *(dict) --*

          A Table object containing the table metadata.

          * **Name** *(string) --*

            The table name. For Hive compatibility, this must be
            entirely lowercase.

          * **DatabaseName** *(string) --*

            The name of the database where the table metadata resides.
            For Hive compatibility, this must be all lowercase.

          * **Description** *(string) --*

            A description of the table.

          * **Owner** *(string) --*

            The owner of the table.

          * **CreateTime** *(datetime) --*

            The time when the table definition was created in the Data
            Catalog.

          * **UpdateTime** *(datetime) --*

            The last time that the table was updated.

          * **LastAccessTime** *(datetime) --*

            The last time that the table was accessed. This is usually
            taken from HDFS, and might not be reliable.

          * **LastAnalyzedTime** *(datetime) --*

            The last time that column statistics were computed for
            this table.

          * **Retention** *(integer) --*

            The retention time for this table.

          * **StorageDescriptor** *(dict) --*

            A storage descriptor containing information about the
            physical storage of this table.

            * **Columns** *(list) --*

              A list of the "Columns" in the table.

              * *(dict) --*

                A column in a "Table".

                * **Name** *(string) --*

                  The name of the "Column".

                * **Type** *(string) --*

                  The data type of the "Column".

                * **Comment** *(string) --*

                  A free-form text comment.

                * **Parameters** *(dict) --*

                  These key-value pairs define properties associated
                  with the column.

                  * *(string) --*

                    * *(string) --*

            * **Location** *(string) --*

              The physical location of the table. By default, this
              takes the form of the warehouse location, followed by
              the database location in the warehouse, followed by the
              table name.

            * **AdditionalLocations** *(list) --*

              A list of locations that point to the path where a Delta
              table is located.

              * *(string) --*

            * **InputFormat** *(string) --*

              The input format: "SequenceFileInputFormat" (binary), or
              "TextInputFormat", or a custom format.

            * **OutputFormat** *(string) --*

              The output format: "SequenceFileOutputFormat" (binary),
              or "IgnoreKeyTextOutputFormat", or a custom format.

            * **Compressed** *(boolean) --*

              "True" if the data in the table is compressed, or
              "False" if not.

            * **NumberOfBuckets** *(integer) --*

              Must be specified if the table contains any dimension
              columns.

            * **SerdeInfo** *(dict) --*

              The serialization/deserialization (SerDe) information.

              * **Name** *(string) --*

                Name of the SerDe.

              * **SerializationLibrary** *(string) --*

                Usually the class that implements the SerDe. An
                example is
                "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe".

              * **Parameters** *(dict) --*

                These key-value pairs define initialization parameters
                for the SerDe.

                * *(string) --*

                  * *(string) --*

            * **BucketColumns** *(list) --*

              A list of reducer grouping columns, clustering columns,
              and bucketing columns in the table.

              * *(string) --*

            * **SortColumns** *(list) --*

              A list specifying the sort order of each bucket in the
              table.

              * *(dict) --*

                Specifies the sort order of a sorted column.

                * **Column** *(string) --*

                  The name of the column.

                * **SortOrder** *(integer) --*

                  Indicates that the column is sorted in ascending
                  order ("== 1"), or in descending order ("== 0").

            * **Parameters** *(dict) --*

              The user-supplied properties in key-value form.

              * *(string) --*

                * *(string) --*

            * **SkewedInfo** *(dict) --*

              The information about values that appear frequently in a
              column (skewed values).

              * **SkewedColumnNames** *(list) --*

                A list of names of columns that contain skewed values.

                * *(string) --*

              * **SkewedColumnValues** *(list) --*

                A list of values that appear so frequently as to be
                considered skewed.

                * *(string) --*

              * **SkewedColumnValueLocationMaps** *(dict) --*

                A mapping of skewed values to the columns that contain
                them.

                * *(string) --*

                  * *(string) --*

            * **StoredAsSubDirectories** *(boolean) --*

              "True" if the table data is stored in subdirectories, or
              "False" if not.

            * **SchemaReference** *(dict) --*

              An object that references a schema stored in the Glue
              Schema Registry.

              When creating a table, you can pass an empty list of
              columns for the schema, and instead use a schema
              reference.

              * **SchemaId** *(dict) --*

                A structure that contains schema identity fields.
                Either this or the "SchemaVersionId" has to be
                provided.

                * **SchemaArn** *(string) --*

                  The Amazon Resource Name (ARN) of the schema. One of
                  "SchemaArn" or "SchemaName" has to be provided.

                * **SchemaName** *(string) --*

                  The name of the schema. One of "SchemaArn" or
                  "SchemaName" has to be provided.

                * **RegistryName** *(string) --*

                  The name of the schema registry that contains the
                  schema.

              * **SchemaVersionId** *(string) --*

                The unique ID assigned to a version of the schema.
                Either this or the "SchemaId" has to be provided.

              * **SchemaVersionNumber** *(integer) --*

                The version number of the schema.

          * **PartitionKeys** *(list) --*

            A list of columns by which the table is partitioned. Only
            primitive types are supported as partition keys.

            When you create a table used by Amazon Athena, and you do
            not specify any partition keys, you must at least set the
            value of "PartitionKeys" to an empty list. For example:

            ""PartitionKeys": []"

            * *(dict) --*

              A column in a "Table".

              * **Name** *(string) --*

                The name of the "Column".

              * **Type** *(string) --*

                The data type of the "Column".

              * **Comment** *(string) --*

                A free-form text comment.

              * **Parameters** *(dict) --*

                These key-value pairs define properties associated
                with the column.

                * *(string) --*

                  * *(string) --*

          * **ViewOriginalText** *(string) --*

            Included for Apache Hive compatibility. Not used in the
            normal course of Glue operations. If the table is a
            "VIRTUAL_VIEW", this field contains certain Athena
            configuration encoded in base64.

          * **ViewExpandedText** *(string) --*

            Included for Apache Hive compatibility. Not used in the
            normal course of Glue operations.

          * **TableType** *(string) --*

            The type of this table. Glue will create tables with the
            "EXTERNAL_TABLE" type. Other services, such as Athena, may
            create tables with additional table types.

            Glue related table types:

            * "EXTERNAL_TABLE": Hive compatible attribute. Indicates a
              non-Hive managed table.

            * "GOVERNED": Used by Lake Formation. The Glue Data Catalog
              understands "GOVERNED".

          * **Parameters** *(dict) --*

            These key-value pairs define properties associated with
            the table.

            * *(string) --*

              * *(string) --*

          * **CreatedBy** *(string) --*

            The person or entity who created the table.

          * **IsRegisteredWithLakeFormation** *(boolean) --*

            Indicates whether the table has been registered with Lake
            Formation.

          * **TargetTable** *(dict) --*

            A "TableIdentifier" structure that describes a target
            table for resource linking.

            * **CatalogId** *(string) --*

              The ID of the Data Catalog in which the table resides.

            * **DatabaseName** *(string) --*

              The name of the catalog database that contains the
              target table.

            * **Name** *(string) --*

              The name of the target table.

            * **Region** *(string) --*

              Region of the target table.

          * **CatalogId** *(string) --*

            The ID of the Data Catalog in which the table resides.

          * **VersionId** *(string) --*

            The ID of the table version.

          * **FederatedTable** *(dict) --*

            A "FederatedTable" structure that references an entity
            outside the Glue Data Catalog.

            * **Identifier** *(string) --*

              A unique identifier for the federated table.

            * **DatabaseIdentifier** *(string) --*

              A unique identifier for the federated database.

            * **ConnectionName** *(string) --*

              The name of the connection to the external metastore.

            * **ConnectionType** *(string) --*

              The type of connection used to access the federated
              table, specifying the protocol or method for connecting
              to the external data source.

          * **ViewDefinition** *(dict) --*

            A structure that contains all the information that defines
            the view, including the dialect or dialects for the view,
            and the query.

            * **IsProtected** *(boolean) --*

              You can set this flag as true to instruct the engine not
              to push user-provided operations into the logical plan
              of the view during query planning. However, setting this
              flag does not guarantee that the engine will comply.
              Refer to the engine's documentation to understand the
              guarantees provided, if any.

            * **Definer** *(string) --*

              The definer of a view in SQL.

            * **SubObjects** *(list) --*

              A list of table Amazon Resource Names (ARNs).

              * *(string) --*

            * **Representations** *(list) --*

              A list of representations.

              * *(dict) --*

                A structure that contains the dialect of the view, and
                the query that defines the view.

                * **Dialect** *(string) --*

                  The dialect of the query engine.

                * **DialectVersion** *(string) --*

                  The version of the dialect of the query engine. For
                  example, 3.0.0.

                * **ViewOriginalText** *(string) --*

                  The "SELECT" query provided by the customer during
                  "CREATE VIEW DDL". This SQL is not used during a
                  query on a view ("ViewExpandedText" is used
                  instead). "ViewOriginalText" is used for cases like
                  "SHOW CREATE VIEW" where users want to see the
                  original DDL command that created the view.

                * **ViewExpandedText** *(string) --*

                  The expanded SQL for the view. This SQL is used by
                  engines while processing a query on a view. Engines
                  may perform operations during view creation to
                  transform "ViewOriginalText" to "ViewExpandedText".
                  For example:

                  * Fully qualified identifiers: "SELECT * from table1
                    -> SELECT * from db1.table1"

                * **ValidationConnection** *(string) --*

                  The name of the connection to be used to validate
                  the specific representation of the view.

                * **IsStale** *(boolean) --*

                  Dialects marked as stale are no longer valid and
                  must be updated before they can be queried in their
                  respective query engines.

          * **IsMultiDialectView** *(boolean) --*

            Specifies whether the view supports the SQL dialects of
            one or more different query engines and can therefore be
            read by those engines.

          * **Status** *(dict) --*

            A structure containing information about the state of an
            asynchronous change to a table.

            * **RequestedBy** *(string) --*

              The ARN of the user who requested the asynchronous
              change.

            * **UpdatedBy** *(string) --*

              The ARN of the user who last manually altered the
              asynchronous change (for example, by requesting
              cancellation).

            * **RequestTime** *(datetime) --*

              An ISO 8601 formatted date string indicating the time
              that the change was initiated.

            * **UpdateTime** *(datetime) --*

              An ISO 8601 formatted date string indicating the time
              that the state was last updated.

            * **Action** *(string) --*

              Indicates which action was called on the table,
              currently only "CREATE" or "UPDATE".

            * **State** *(string) --*

              A generic status for the change in progress, such as
              QUEUED, IN_PROGRESS, SUCCESS, or FAILED.

            * **Error** *(dict) --*

              An error that will only appear when the state is
              "FAILED". This is a parent level exception message;
              there may be different "Error" values for each dialect.

              * **ErrorCode** *(string) --*

                The code associated with this error.

              * **ErrorMessage** *(string) --*

                A message describing the error.

            * **Details** *(dict) --*

              A "StatusDetails" object with information about the
              requested change.

              * **RequestedChange** *(dict) --*

                A "Table" object representing the requested changes.

              * **ViewValidations** *(list) --*

                A list of "ViewValidation" objects that contain
                information for an analytical engine to validate a
                view.

                * *(dict) --*

                  A structure that contains information for an
                  analytical engine to validate a view, prior to
                  persisting the view metadata. Used in the case of
                  direct "UpdateTable" or "CreateTable" API calls.

                  * **Dialect** *(string) --*

                    The dialect of the query engine.

                  * **DialectVersion** *(string) --*

                    The version of the dialect of the query engine.
                    For example, 3.0.0.

                  * **ViewValidationText** *(string) --*

                    The "SELECT" query that defines the view, as
                    provided by the customer.

                  * **UpdateTime** *(datetime) --*

                    The time of the last update.

                  * **State** *(string) --*

                    The state of the validation.

                  * **Error** *(dict) --*

                    An error associated with the validation.

                    * **ErrorCode** *(string) --*

                      The code associated with this error.

                    * **ErrorMessage** *(string) --*

                      A message describing the error.

        * **AuthorizedColumns** *(list) --*

          A list of column names that the user has been granted access
          to.

          * *(string) --*

        * **IsRegisteredWithLakeFormation** *(boolean) --*

          A Boolean value that indicates whether the partition
          location is registered with Lake Formation.

        * **CellFilters** *(list) --*

          A list of column row filters.

          * *(dict) --*

            A filter that uses both column-level and row-level
            filtering.

            * **ColumnName** *(string) --*

              A string containing the name of the column.

            * **RowFilterExpression** *(string) --*

              A string containing the row-level filter expression.

        * **QueryAuthorizationId** *(string) --*

          A cryptographically generated query identifier generated by
          Glue or Lake Formation.

        * **IsMultiDialectView** *(boolean) --*

          Specifies whether the view supports the SQL dialects of one
          or more different query engines and can therefore be read by
          those engines.

        * **ResourceArn** *(string) --*

          The resource ARN of the parent resource extracted from the
          request.

        * **IsProtected** *(boolean) --*

          A flag that instructs the engine not to push user-provided
          operations into the logical plan of the view during query
          planning. However, setting this flag does not guarantee that
          the engine will comply. Refer to the engine's documentation
          to understand the guarantees provided, if any.

        * **Permissions** *(list) --*

          The Lake Formation data permissions of the caller on the
          table. Used to authorize the call when no view context is
          found.

          * *(string) --*

        * **RowFilter** *(string) --*

          The filter that applies to the table. For example, when
          applying the filter in SQL, it goes in the "WHERE" clause
          and can be evaluated by using an "AND" operator with any
          other predicates applied by the user querying the table.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"

   * "Glue.Client.exceptions.PermissionTypeMismatchException"

   * "Glue.Client.exceptions.FederationSourceException"

   * "Glue.Client.exceptions.FederationSourceRetryableException"
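
   The "RowFilter" description above says the filter belongs in a
   "WHERE" clause, combined through "AND" with the predicates of the
   user querying the table. The following is a minimal sketch of that
   combination, assuming a response dict shaped like the Response
   Syntax above. The helper names "build_where_clause" and
   "authorized_columns" and the sample values are hypothetical and are
   not part of boto3.

```python
def build_where_clause(response, user_predicate=None):
    """Combine the RowFilter from a response with a user predicate via AND.

    Returns None when neither a row filter nor a user predicate is present.
    """
    row_filter = response.get("RowFilter")
    # Parenthesize each part so that an OR inside one predicate cannot
    # change the meaning of the combined expression.
    parts = [f"({p})" for p in (user_predicate, row_filter) if p]
    return " AND ".join(parts) if parts else None


def authorized_columns(response):
    """Return the column names the caller may read, in response order."""
    return list(response.get("AuthorizedColumns", []))


# Hypothetical response fragment, for illustration only.
example = {
    "AuthorizedColumns": ["id", "region"],
    "RowFilter": "region = 'eu'",
}
where = build_where_clause(example, "id > 10")
columns = authorized_columns(example)
```

   A query engine would typically project only the authorized columns
   and append the combined predicate to its own query plan; how that is
   done depends on the engine.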


start_export_labels_task_run
****************************

Glue.Client.start_export_labels_task_run(**kwargs)

   Begins an asynchronous task to export all labeled data for a
   particular transform. This task is the only label-related API call
   that is not part of the typical active learning workflow. You
   typically use "StartExportLabelsTaskRun" when you want to work with
   all of your existing labels at the same time, such as when you want
   to remove or change labels that were previously submitted as truth.
   This API operation accepts the "TransformId" whose labels you want
   to export and an Amazon Simple Storage Service (Amazon S3) path to
   export the labels to. The operation returns a "TaskRunId". You can
   check on the status of your task run by calling the "GetMLTaskRun"
   API.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.start_export_labels_task_run(
          TransformId='string',
          OutputS3Path='string'
      )

   Parameters:
      * **TransformId** (*string*) --

        **[REQUIRED]**

        The unique identifier of the machine learning transform.

      * **OutputS3Path** (*string*) --

        **[REQUIRED]**

        The Amazon S3 path where you export the labels.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'TaskRunId': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **TaskRunId** *(string) --*

          The unique identifier for the task run.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"
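
   The description above notes that the operation returns a
   "TaskRunId" whose status can be checked with "GetMLTaskRun". The
   following is a minimal sketch of that start-then-poll pattern. The
   helper name "export_labels_and_wait", the polling interval, and the
   set of states treated as final are assumptions for illustration,
   not part of the API. "client" is expected to be a Glue client, for
   example one created with "boto3.client('glue')".

```python
import time


def export_labels_and_wait(client, transform_id, output_s3_path,
                           poll_seconds=30, max_polls=20):
    """Start a label export, then poll GetMLTaskRun for its status.

    Returns (task_run_id, last_observed_status).
    """
    response = client.start_export_labels_task_run(
        TransformId=transform_id,
        OutputS3Path=output_s3_path,
    )
    task_run_id = response["TaskRunId"]

    status = None
    for _ in range(max_polls):
        run = client.get_ml_task_run(
            TransformId=transform_id,
            TaskRunId=task_run_id,
        )
        status = run.get("Status")
        # Assumption: after these states the task run no longer progresses.
        if status in ("SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT"):
            break
        time.sleep(poll_seconds)
    return task_run_id, status
```

   If the polling budget runs out, the function returns the last
   status it observed, so callers should check for "SUCCEEDED" rather
   than assume the export finished.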


delete_partition
****************

Glue.Client.delete_partition(**kwargs)

   Deletes a specified partition.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_partition(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          PartitionValues=[
              'string',
          ]
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog where
        the partition to be deleted resides. If none is provided, the
        Amazon Web Services account ID is used by default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the catalog database in which the table in
        question resides.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table that contains the partition to be
        deleted.

      * **PartitionValues** (*list*) --

        **[REQUIRED]**

        The values that define the partition.

        * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
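
   The following is a minimal sketch of calling this operation while
   treating an already-missing partition as a non-error. The helper
   name "delete_partition_if_exists" is hypothetical. It assumes that
   "client" is a Glue client (for example "boto3.client('glue')") and
   that the partition values are listed in the same order as the
   table's partition keys.

```python
def delete_partition_if_exists(client, database_name, table_name,
                               partition_values, catalog_id=None):
    """Delete one partition; return False if it was not found."""
    kwargs = {
        "DatabaseName": database_name,
        "TableName": table_name,
        # PartitionValues is a list of strings, so convert each value.
        "PartitionValues": [str(v) for v in partition_values],
    }
    # CatalogId is optional; when omitted, the account ID is used by default.
    if catalog_id is not None:
        kwargs["CatalogId"] = catalog_id
    try:
        client.delete_partition(**kwargs)
    except client.exceptions.EntityNotFoundException:
        return False
    return True
```

   Other exceptions listed above, such as "InvalidInputException", are
   deliberately left to propagate to the caller.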


delete_column_statistics_for_partition
**************************************

Glue.Client.delete_column_statistics_for_partition(**kwargs)

   Deletes the partition column statistics of a column.

   The Identity and Access Management (IAM) permission required for
   this operation is "DeletePartition".

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_column_statistics_for_partition(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          PartitionValues=[
              'string',
          ],
          ColumnName='string'
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog where
        the partitions in question reside. If none is supplied, the
        Amazon Web Services account ID is used by default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the catalog database where the partitions reside.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the partitions' table.

      * **PartitionValues** (*list*) --

        **[REQUIRED]**

        A list of partition values identifying the partition.

        * *(string) --*

      * **ColumnName** (*string*) --

        **[REQUIRED]**

        Name of the column.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"
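
   Because the request accepts a single "ColumnName", clearing
   statistics for several columns of one partition takes one call per
   column. The following is a minimal sketch of that loop. The helper
   name "delete_partition_column_statistics" is hypothetical, and
   "client" is assumed to be a Glue client such as
   "boto3.client('glue')".

```python
def delete_partition_column_statistics(client, database_name, table_name,
                                       partition_values, column_names):
    """Delete statistics for several columns of one partition.

    Returns a dict mapping each column name to True if the call
    succeeded, or False if EntityNotFoundException was raised.
    """
    results = {}
    for column_name in column_names:
        try:
            client.delete_column_statistics_for_partition(
                DatabaseName=database_name,
                TableName=table_name,
                PartitionValues=list(partition_values),
                ColumnName=column_name,
            )
            results[column_name] = True
        except client.exceptions.EntityNotFoundException:
            results[column_name] = False
    return results
```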


get_database
************

Glue.Client.get_database(**kwargs)

   Retrieves the definition of a specified database.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_database(
          CatalogId='string',
          Name='string'
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog in
        which the database resides. If none is provided, the Amazon
        Web Services account ID is used by default.

      * **Name** (*string*) --

        **[REQUIRED]**

        The name of the database to retrieve. For Hive compatibility,
        this should be all lowercase.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Database': {
                 'Name': 'string',
                 'Description': 'string',
                 'LocationUri': 'string',
                 'Parameters': {
                     'string': 'string'
                 },
                 'CreateTime': datetime(2015, 1, 1),
                 'CreateTableDefaultPermissions': [
                     {
                         'Principal': {
                             'DataLakePrincipalIdentifier': 'string'
                         },
                         'Permissions': [
                             'ALL'|'SELECT'|'ALTER'|'DROP'|'DELETE'|'INSERT'|'CREATE_DATABASE'|'CREATE_TABLE'|'DATA_LOCATION_ACCESS',
                         ]
                     },
                 ],
                 'TargetDatabase': {
                     'CatalogId': 'string',
                     'DatabaseName': 'string',
                     'Region': 'string'
                 },
                 'CatalogId': 'string',
                 'FederatedDatabase': {
                     'Identifier': 'string',
                     'ConnectionName': 'string',
                     'ConnectionType': 'string'
                 }
             }
         }

      **Response Structure**

      * *(dict) --*

        * **Database** *(dict) --*

          The definition of the specified database in the Data
          Catalog.

          * **Name** *(string) --*

            The name of the database. For Hive compatibility, this is
            folded to lowercase when it is stored.

          * **Description** *(string) --*

            A description of the database.

          * **LocationUri** *(string) --*

            The location of the database (for example, an HDFS path).

          * **Parameters** *(dict) --*

            These key-value pairs define parameters and properties of
            the database.

            * *(string) --*

              * *(string) --*

          * **CreateTime** *(datetime) --*

            The time at which the metadata database was created in the
            catalog.

          * **CreateTableDefaultPermissions** *(list) --*

            Creates a set of default permissions on the table for
            principals. Used by Lake Formation. Not used in the normal
            course of Glue operations.

            * *(dict) --*

              Permissions granted to a principal.

              * **Principal** *(dict) --*

                The principal who is granted permissions.

                * **DataLakePrincipalIdentifier** *(string) --*

                  An identifier for the Lake Formation principal.

              * **Permissions** *(list) --*

                The permissions that are granted to the principal.

                * *(string) --*

          * **TargetDatabase** *(dict) --*

            A "DatabaseIdentifier" structure that describes a target
            database for resource linking.

            * **CatalogId** *(string) --*

              The ID of the Data Catalog in which the database
              resides.

            * **DatabaseName** *(string) --*

              The name of the catalog database.

            * **Region** *(string) --*

              Region of the target database.

          * **CatalogId** *(string) --*

            The ID of the Data Catalog in which the database resides.

          * **FederatedDatabase** *(dict) --*

            A "FederatedDatabase" structure that references an entity
            outside the Glue Data Catalog.

            * **Identifier** *(string) --*

              A unique identifier for the federated database.

            * **ConnectionName** *(string) --*

              The name of the connection to the external metastore.

            * **ConnectionType** *(string) --*

              The type of connection used to access the federated
              database, such as JDBC, ODBC, or other supported
              connection protocols.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"

   * "Glue.Client.exceptions.FederationSourceException"

   * "Glue.Client.exceptions.FederationSourceRetryableException"
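
A minimal sketch of how the call above might be wrapped, assuming the caller passes in a client created with "boto3.client('glue')". The helper name "get_database_summary" and the keys of the returned summary are illustrative choices, not part of the Glue API. Optional response keys are read with "dict.get" because they may be absent.

```python
def get_database_summary(glue_client, name, catalog_id=None):
    """Return a small summary of a Glue database, or None if it is missing.

    glue_client is assumed to be a boto3 Glue client (or an object with the
    same get_database method and exceptions attribute).
    """
    kwargs = {"Name": name}
    # CatalogId is optional; when omitted, the account ID is used by default.
    if catalog_id is not None:
        kwargs["CatalogId"] = catalog_id

    try:
        response = glue_client.get_database(**kwargs)
    except glue_client.exceptions.EntityNotFoundException:
        return None

    database = response["Database"]
    return {
        "name": database["Name"],
        "location": database.get("LocationUri"),
        # TargetDatabase is only present for resource links.
        "is_resource_link": "TargetDatabase" in database,
        # FederatedDatabase references an entity outside the Data Catalog.
        "is_federated": "FederatedDatabase" in database,
    }
```

With a real client this could be called as "get_database_summary(boto3.client('glue'), 'my_database')", where the database name is a placeholder.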


get_integration_table_properties
********************************

Glue.Client.get_integration_table_properties(**kwargs)

   Retrieves the optional override properties for the tables that
   need to be replicated. These can include filtering and
   partitioning properties for source and target tables.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_integration_table_properties(
          ResourceArn='string',
          TableName='string'
      )

   Parameters:
      * **ResourceArn** (*string*) --

        **[REQUIRED]**

        The Amazon Resource Name (ARN) of the target table for which
        to retrieve integration table properties. Currently, this API
        only supports retrieving properties for target tables, and the
        provided ARN should be the ARN of the target table in the Glue
        Data Catalog. Support for retrieving integration table
        properties for source connections (using the connection ARN)
        is not yet implemented and will be added in a future release.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table to be replicated.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'ResourceArn': 'string',
             'TableName': 'string',
             'SourceTableConfig': {
                 'Fields': [
                     'string',
                 ],
                 'FilterPredicate': 'string',
                 'PrimaryKey': [
                     'string',
                 ],
                 'RecordUpdateField': 'string'
             },
             'TargetTableConfig': {
                 'UnnestSpec': 'TOPLEVEL'|'FULL'|'NOUNNEST',
                 'PartitionSpec': [
                     {
                         'FieldName': 'string',
                         'FunctionSpec': 'string',
                         'ConversionSpec': 'string'
                     },
                 ],
                 'TargetTableName': 'string'
             }
         }

      **Response Structure**

      * *(dict) --*

        * **ResourceArn** *(string) --*

          The Amazon Resource Name (ARN) of the target table for which
          to retrieve integration table properties. Currently, this
          API only supports retrieving properties for target tables,
          and the provided ARN should be the ARN of the target table
          in the Glue Data Catalog. Support for retrieving integration
          table properties for source connections (using the
          connection ARN) is not yet implemented and will be added in
          a future release.

        * **TableName** *(string) --*

          The name of the table to be replicated.

        * **SourceTableConfig** *(dict) --*

          A structure for the source table configuration.

          * **Fields** *(list) --*

            A list of fields used for column-level filtering.
            Currently unsupported.

            * *(string) --*

          * **FilterPredicate** *(string) --*

            A condition clause used for row-level filtering. Currently
            unsupported.

          * **PrimaryKey** *(list) --*

            Provide the primary key set for this table. Currently
            supported specifically for SAP "EntityOf" entities upon
            request. Contact Amazon Web Services Support to make this
            feature available.

            * *(string) --*

          * **RecordUpdateField** *(string) --*

            Incremental pull timestamp-based field. Currently
            unsupported.

        * **TargetTableConfig** *(dict) --*

          A structure for the target table configuration.

          * **UnnestSpec** *(string) --*

            Specifies how nested objects are flattened to top-level
            elements. Valid values are: "TOPLEVEL", "FULL", or
            "NOUNNEST".

          * **PartitionSpec** *(list) --*

            Determines the file layout on the target.

            * *(dict) --*

              A structure that describes how data is partitioned on
              the target.

              * **FieldName** *(string) --*

                The field name used to partition data on the target.
                Avoid using columns that have unique values for each
                row (for example, *LastModifiedTimestamp*,
                *SystemModTimeStamp*) as the partition column. These
                columns are not suitable for partitioning because they
                create a large number of small partitions, which can
                lead to performance issues.

              * **FunctionSpec** *(string) --*

                Specifies the function used to partition data on the
                target. The only accepted value for this parameter is
                *'identity'* (string). The *'identity'* function
                ensures that the data partitioning on the target
                follows the same scheme as the source. In other words,
                the partitioning structure of the source data is
                preserved in the target destination.

              * **ConversionSpec** *(string) --*

                Specifies the timestamp format of the source data.
                Valid values are:

                * "epoch_sec" - Unix epoch timestamp in seconds

                * "epoch_milli" - Unix epoch timestamp in milliseconds

                * "iso" - ISO 8601 formatted timestamp

                Note:

                  Only specify "ConversionSpec" when using timestamp-
                  based partition functions (year, month, day, or
                  hour). Glue Zero-ETL uses this parameter to
                  correctly transform source data into timestamp
                  format before partitioning.

                  Do not use high-cardinality columns with the
                  "identity" partition function. High-cardinality
                  columns include:

                  * Primary keys

                  * Timestamp fields (such as "LastModifiedTimestamp",
                    "CreatedDate")

                  * System-generated timestamps

                  Using high-cardinality columns with identity
                  partitioning creates many small partitions, which
                  can significantly degrade ingestion performance.

          * **TargetTableName** *(string) --*

            The optional name of a target table.

   **Exceptions**

   * "Glue.Client.exceptions.ValidationException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.ResourceNotFoundException"

   * "Glue.Client.exceptions.InternalServerException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.InvalidInputException"
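
As a rough illustration of reading the response structure above, the following sketch pulls the partition specification out of "TargetTableConfig". The helper name "fetch_partition_spec" is hypothetical, and the client is assumed to be one created with "boto3.client('glue')". Missing keys are treated as "no partitioning configured", which is an assumption about how callers may want to handle sparse responses.

```python
def fetch_partition_spec(glue_client, resource_arn, table_name):
    """Return the target partition spec as a list of tuples.

    Each tuple is (FieldName, FunctionSpec, ConversionSpec); entries that the
    service did not return are represented as None.
    """
    response = glue_client.get_integration_table_properties(
        ResourceArn=resource_arn,
        TableName=table_name,
    )
    # TargetTableConfig and PartitionSpec are read defensively because they
    # may be absent when no overrides were configured.
    target_config = response.get("TargetTableConfig", {})
    return [
        (
            spec.get("FieldName"),
            spec.get("FunctionSpec"),
            spec.get("ConversionSpec"),
        )
        for spec in target_config.get("PartitionSpec", [])
    ]
```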


tag_resource
************

Glue.Client.tag_resource(**kwargs)

   Adds tags to a resource. A tag is a label you can assign to an
   Amazon Web Services resource. In Glue, you can tag only certain
   resources. For information about what resources you can tag, see
   Amazon Web Services Tags in Glue.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.tag_resource(
          ResourceArn='string',
          TagsToAdd={
              'string': 'string'
          }
      )

   Parameters:
      * **ResourceArn** (*string*) --

        **[REQUIRED]**

        The ARN of the Glue resource to which to add the tags. For
        more information about Glue resource ARNs, see the Glue ARN
        string pattern.

      * **TagsToAdd** (*dict*) --

        **[REQUIRED]**

        Tags to add to this resource.

        * *(string) --*

          * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.EntityNotFoundException"
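
A small sketch of calling "tag_resource", assuming a client created with "boto3.client('glue')" is passed in. The helper name "add_tags" is illustrative, and coercing keys and values to strings is a convenience chosen here, not something the API does for you.

```python
def add_tags(glue_client, resource_arn, tags):
    """Add tags to a Glue resource and return the tag keys that were sent."""
    if not tags:
        raise ValueError("tags must contain at least one key-value pair")

    # TagsToAdd is a string-to-string map, so coerce both sides defensively.
    tags_to_add = {str(key): str(value) for key, value in tags.items()}

    # The documented response is an empty dict, so nothing is read from it.
    glue_client.tag_resource(ResourceArn=resource_arn, TagsToAdd=tags_to_add)
    return sorted(tags_to_add)
```

For example, "add_tags(client, 'arn:aws:glue:us-east-1:123456789012:job/example-job', {'team': 'data'})" would tag a job; the ARN shown is a placeholder.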


create_table_optimizer
**********************

Glue.Client.create_table_optimizer(**kwargs)

   Creates a new table optimizer for a specific function.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_table_optimizer(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          Type='compaction'|'retention'|'orphan_file_deletion',
          TableOptimizerConfiguration={
              'roleArn': 'string',
              'enabled': True|False,
              'vpcConfiguration': {
                  'glueConnectionName': 'string'
              },
              'compactionConfiguration': {
                  'icebergConfiguration': {
                      'strategy': 'binpack'|'sort'|'z-order',
                      'minInputFiles': 123,
                      'deleteFileThreshold': 123
                  }
              },
              'retentionConfiguration': {
                  'icebergConfiguration': {
                      'snapshotRetentionPeriodInDays': 123,
                      'numberOfSnapshotsToRetain': 123,
                      'cleanExpiredFiles': True|False,
                      'runRateInHours': 123
                  }
              },
              'orphanFileDeletionConfiguration': {
                  'icebergConfiguration': {
                      'orphanFileRetentionPeriodInDays': 123,
                      'location': 'string',
                      'runRateInHours': 123
                  }
              }
          }
      )

   Parameters:
      * **CatalogId** (*string*) --

        **[REQUIRED]**

        The Catalog ID of the table.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the database in the catalog in which the table
        resides.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table.

      * **Type** (*string*) --

        **[REQUIRED]**

        The type of table optimizer.

      * **TableOptimizerConfiguration** (*dict*) --

        **[REQUIRED]**

        A "TableOptimizerConfiguration" object representing the
        configuration of a table optimizer.

        * **roleArn** *(string) --*

          A role passed by the caller which gives the service
          permission to update the resources associated with the
          optimizer on the caller's behalf.

        * **enabled** *(boolean) --*

          Whether table optimization is enabled.

        * **vpcConfiguration** *(dict) --*

          A "TableOptimizerVpcConfiguration" object representing the
          VPC configuration for a table optimizer.

          This configuration is necessary to perform optimization on
          tables that are in a customer VPC.

          Note:

            This is a Tagged Union structure. Only one of the
            following top level keys can be set: "glueConnectionName".

          * **glueConnectionName** *(string) --*

            The name of the Glue connection used for the VPC for the
            table optimizer.

        * **compactionConfiguration** *(dict) --*

          The configuration for a compaction optimizer. This
          configuration defines how data files in your table will be
          compacted to improve query performance and reduce storage
          costs.

          * **icebergConfiguration** *(dict) --*

            The configuration for an Iceberg compaction optimizer.

            * **strategy** *(string) --*

              The strategy to use for compaction. Valid values are:

              * "binpack": Combines small files into larger files,
                typically targeting sizes over 100MB, while applying
                any pending deletes. This is the recommended
                compaction strategy for most use cases.

              * "sort": Organizes data based on specified columns
                which are sorted hierarchically during compaction,
                improving query performance for filtered operations.
                This strategy is recommended when your queries
                frequently filter on specific columns. To use this
                strategy, you must first define a sort order in your
                Iceberg table properties using the "sort_order" table
                property.

              * "z-order": Optimizes data organization by blending
                multiple attributes into a single scalar value that
                can be used for sorting, allowing efficient querying
                across multiple dimensions. This strategy is
                recommended when you need to query data across
                multiple dimensions simultaneously. To use this
                strategy, you must first define a sort order in your
                Iceberg table properties using the "sort_order" table
                property.

              If an input is not provided, the default value 'binpack'
              will be used.

            * **minInputFiles** *(integer) --*

              The minimum number of data files that must be present in
              a partition before compaction will actually compact
              files. This parameter helps control when compaction is
              triggered, preventing unnecessary compaction operations
              on partitions with few files. If an input is not
              provided, the default value 100 will be used.

            * **deleteFileThreshold** *(integer) --*

              The minimum number of deletes that must be present in a
              data file to make it eligible for compaction. This
              parameter helps optimize compaction by focusing on files
              that contain a significant number of delete operations,
              which can improve query performance by removing deleted
              records. If an input is not provided, the default value
              1 will be used.

        * **retentionConfiguration** *(dict) --*

          The configuration for a snapshot retention optimizer.

          * **icebergConfiguration** *(dict) --*

            The configuration for an Iceberg snapshot retention
            optimizer.

            * **snapshotRetentionPeriodInDays** *(integer) --*

              The number of days to retain the Iceberg snapshots. If
              an input is not provided, the corresponding Iceberg
              table configuration field is used; if that field is not
              present, the default value 5 is used.

            * **numberOfSnapshotsToRetain** *(integer) --*

              The number of Iceberg snapshots to retain within the
              retention period. If an input is not provided, the
              corresponding Iceberg table configuration field is
              used; if that field is not present, the default value 1
              is used.

            * **cleanExpiredFiles** *(boolean) --*

              If set to false, snapshots are only deleted from table
              metadata, and the underlying data and metadata files are
              not deleted.

            * **runRateInHours** *(integer) --*

              The interval in hours between retention job runs. This
              parameter controls how frequently the retention
              optimizer will run to clean up expired snapshots. The
              value must be between 3 and 168 hours (7 days). If an
              input is not provided, the default value 24 will be
              used.

        * **orphanFileDeletionConfiguration** *(dict) --*

          The configuration for an orphan file deletion optimizer.

          * **icebergConfiguration** *(dict) --*

            The configuration for an Iceberg orphan file deletion
            optimizer.

            * **orphanFileRetentionPeriodInDays** *(integer) --*

              The number of days that orphan files should be retained
              before file deletion. If an input is not provided, the
              default value 3 will be used.

            * **location** *(string) --*

              Specifies a directory in which to look for files
              (defaults to the table's location). You may choose a
              sub-directory rather than the top-level table location.

            * **runRateInHours** *(integer) --*

              The interval in hours between orphan file deletion job
              runs. This parameter controls how frequently the orphan
              file deletion optimizer will run to clean up orphan
              files. The value must be between 3 and 168 hours (7
              days). If an input is not provided, the default value 24
              will be used.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.ValidationException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.ThrottlingException"
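
The nested "TableOptimizerConfiguration" is easy to get wrong by hand, so the sketch below builds the request arguments for two of the optimizer types and checks the constraints stated above (the three compaction strategies, and the 3 to 168 hour range for "runRateInHours"). The function names are hypothetical helpers, not part of boto3, and the local validation is a convenience that only mirrors the documented limits.

```python
VALID_STRATEGIES = ("binpack", "sort", "z-order")


def _base_request(catalog_id, database_name, table_name, optimizer_type,
                  role_arn, enabled):
    """Build the parts of the request shared by every optimizer type."""
    return {
        "CatalogId": catalog_id,
        "DatabaseName": database_name,
        "TableName": table_name,
        "Type": optimizer_type,
        "TableOptimizerConfiguration": {
            "roleArn": role_arn,
            "enabled": enabled,
        },
    }


def build_compaction_request(catalog_id, database_name, table_name, role_arn,
                             strategy="binpack", enabled=True):
    """Return keyword arguments for a compaction optimizer."""
    if strategy not in VALID_STRATEGIES:
        raise ValueError(f"strategy must be one of {VALID_STRATEGIES}")
    request = _base_request(catalog_id, database_name, table_name,
                            "compaction", role_arn, enabled)
    request["TableOptimizerConfiguration"]["compactionConfiguration"] = {
        "icebergConfiguration": {"strategy": strategy},
    }
    return request


def build_retention_request(catalog_id, database_name, table_name, role_arn,
                            run_rate_in_hours=24, enabled=True):
    """Return keyword arguments for a snapshot retention optimizer."""
    # The documented range for runRateInHours is 3 to 168 hours (7 days).
    if not 3 <= run_rate_in_hours <= 168:
        raise ValueError("run_rate_in_hours must be between 3 and 168")
    request = _base_request(catalog_id, database_name, table_name,
                            "retention", role_arn, enabled)
    request["TableOptimizerConfiguration"]["retentionConfiguration"] = {
        "icebergConfiguration": {"runRateInHours": run_rate_in_hours},
    }
    return request
```

The resulting dictionary can be unpacked into the call, for example "client.create_table_optimizer(**build_compaction_request(...))", with the account ID, names, and role ARN supplied by the caller.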


create_usage_profile
********************

Glue.Client.create_usage_profile(**kwargs)

   Creates a Glue usage profile.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_usage_profile(
          Name='string',
          Description='string',
          Configuration={
              'SessionConfiguration': {
                  'string': {
                      'DefaultValue': 'string',
                      'AllowedValues': [
                          'string',
                      ],
                      'MinValue': 'string',
                      'MaxValue': 'string'
                  }
              },
              'JobConfiguration': {
                  'string': {
                      'DefaultValue': 'string',
                      'AllowedValues': [
                          'string',
                      ],
                      'MinValue': 'string',
                      'MaxValue': 'string'
                  }
              }
          },
          Tags={
              'string': 'string'
          }
      )

   Parameters:
      * **Name** (*string*) --

        **[REQUIRED]**

        The name of the usage profile.

      * **Description** (*string*) -- A description of the usage
        profile.

      * **Configuration** (*dict*) --

        **[REQUIRED]**

        A "ProfileConfiguration" object specifying the job and session
        values for the profile.

        * **SessionConfiguration** *(dict) --*

          A key-value map of configuration parameters for Glue
          sessions.

          * *(string) --*

            * *(dict) --*

              Specifies the values that an admin sets for each job or
              session parameter configured in a Glue usage profile.

              * **DefaultValue** *(string) --*

                A default value for the parameter.

              * **AllowedValues** *(list) --*

                A list of allowed values for the parameter.

                * *(string) --*

              * **MinValue** *(string) --*

                A minimum allowed value for the parameter.

              * **MaxValue** *(string) --*

                A maximum allowed value for the parameter.

        * **JobConfiguration** *(dict) --*

          A key-value map of configuration parameters for Glue jobs.

          * *(string) --*

            * *(dict) --*

              Specifies the values that an admin sets for each job or
              session parameter configured in a Glue usage profile.

              * **DefaultValue** *(string) --*

                A default value for the parameter.

              * **AllowedValues** *(list) --*

                A list of allowed values for the parameter.

                * *(string) --*

              * **MinValue** *(string) --*

                A minimum allowed value for the parameter.

              * **MaxValue** *(string) --*

                A maximum allowed value for the parameter.

      * **Tags** (*dict*) --

        A map of tags (key-value pairs) applied to the usage profile.

        * *(string) --*

          * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Name': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Name** *(string) --*

          The name of the usage profile that was created.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.OperationNotSupportedException"
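
Every field inside a configuration value ("DefaultValue", "AllowedValues", "MinValue", "MaxValue") is typed as a string, even for numeric limits. The sketch below, which uses hypothetical helper names, converts Python values to strings and omits unset fields before assembling the request arguments. Which parameter keys Glue accepts inside "JobConfiguration" and "SessionConfiguration" is not covered here, so the keys are left entirely to the caller.

```python
def config_value(default=None, allowed=None, minimum=None, maximum=None):
    """Build one configuration value, converting every field to a string."""
    value = {}
    if default is not None:
        value["DefaultValue"] = str(default)
    if allowed is not None:
        value["AllowedValues"] = [str(item) for item in allowed]
    if minimum is not None:
        value["MinValue"] = str(minimum)
    if maximum is not None:
        value["MaxValue"] = str(maximum)
    return value


def build_usage_profile_request(name, job_parameters=None,
                                session_parameters=None, description=None,
                                tags=None):
    """Return keyword arguments for create_usage_profile.

    job_parameters and session_parameters map parameter names (chosen by the
    caller) to dictionaries produced by config_value.
    """
    configuration = {}
    if job_parameters:
        configuration["JobConfiguration"] = dict(job_parameters)
    if session_parameters:
        configuration["SessionConfiguration"] = dict(session_parameters)

    # Name and Configuration are required; the rest are optional.
    request = {"Name": name, "Configuration": configuration}
    if description is not None:
        request["Description"] = description
    if tags:
        request["Tags"] = dict(tags)
    return request
```

The result can be passed as "client.create_usage_profile(**request)"; the documented response contains the "Name" of the profile that was created.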


get_crawler
***********

Glue.Client.get_crawler(**kwargs)

   Retrieves metadata for a specified crawler.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_crawler(
          Name='string'
      )

   Parameters:
      **Name** (*string*) --

      **[REQUIRED]**

      The name of the crawler to retrieve metadata for.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Crawler': {
                 'Name': 'string',
                 'Role': 'string',
                 'Targets': {
                     'S3Targets': [
                         {
                             'Path': 'string',
                             'Exclusions': [
                                 'string',
                             ],
                             'ConnectionName': 'string',
                             'SampleSize': 123,
                             'EventQueueArn': 'string',
                             'DlqEventQueueArn': 'string'
                         },
                     ],
                     'JdbcTargets': [
                         {
                             'ConnectionName': 'string',
                             'Path': 'string',
                             'Exclusions': [
                                 'string',
                             ],
                             'EnableAdditionalMetadata': [
                                 'COMMENTS'|'RAWTYPES',
                             ]
                         },
                     ],
                     'MongoDBTargets': [
                         {
                             'ConnectionName': 'string',
                             'Path': 'string',
                             'ScanAll': True|False
                         },
                     ],
                     'DynamoDBTargets': [
                         {
                             'Path': 'string',
                             'scanAll': True|False,
                             'scanRate': 123.0
                         },
                     ],
                     'CatalogTargets': [
                         {
                             'DatabaseName': 'string',
                             'Tables': [
                                 'string',
                             ],
                             'ConnectionName': 'string',
                             'EventQueueArn': 'string',
                             'DlqEventQueueArn': 'string'
                         },
                     ],
                     'DeltaTargets': [
                         {
                             'DeltaTables': [
                                 'string',
                             ],
                             'ConnectionName': 'string',
                             'WriteManifest': True|False,
                             'CreateNativeDeltaTable': True|False
                         },
                     ],
                     'IcebergTargets': [
                         {
                             'Paths': [
                                 'string',
                             ],
                             'ConnectionName': 'string',
                             'Exclusions': [
                                 'string',
                             ],
                             'MaximumTraversalDepth': 123
                         },
                     ],
                     'HudiTargets': [
                         {
                             'Paths': [
                                 'string',
                             ],
                             'ConnectionName': 'string',
                             'Exclusions': [
                                 'string',
                             ],
                             'MaximumTraversalDepth': 123
                         },
                     ]
                 },
                 'DatabaseName': 'string',
                 'Description': 'string',
                 'Classifiers': [
                     'string',
                 ],
                 'RecrawlPolicy': {
                     'RecrawlBehavior': 'CRAWL_EVERYTHING'|'CRAWL_NEW_FOLDERS_ONLY'|'CRAWL_EVENT_MODE'
                 },
                 'SchemaChangePolicy': {
                     'UpdateBehavior': 'LOG'|'UPDATE_IN_DATABASE',
                     'DeleteBehavior': 'LOG'|'DELETE_FROM_DATABASE'|'DEPRECATE_IN_DATABASE'
                 },
                 'LineageConfiguration': {
                     'CrawlerLineageSettings': 'ENABLE'|'DISABLE'
                 },
                 'State': 'READY'|'RUNNING'|'STOPPING',
                 'TablePrefix': 'string',
                 'Schedule': {
                     'ScheduleExpression': 'string',
                     'State': 'SCHEDULED'|'NOT_SCHEDULED'|'TRANSITIONING'
                 },
                 'CrawlElapsedTime': 123,
                 'CreationTime': datetime(2015, 1, 1),
                 'LastUpdated': datetime(2015, 1, 1),
                 'LastCrawl': {
                     'Status': 'SUCCEEDED'|'CANCELLED'|'FAILED',
                     'ErrorMessage': 'string',
                     'LogGroup': 'string',
                     'LogStream': 'string',
                     'MessagePrefix': 'string',
                     'StartTime': datetime(2015, 1, 1)
                 },
                 'Version': 123,
                 'Configuration': 'string',
                 'CrawlerSecurityConfiguration': 'string',
                 'LakeFormationConfiguration': {
                     'UseLakeFormationCredentials': True|False,
                     'AccountId': 'string'
                 }
             }
         }

      **Response Structure**

      * *(dict) --*

        * **Crawler** *(dict) --*

          The metadata for the specified crawler.

          * **Name** *(string) --*

            The name of the crawler.

          * **Role** *(string) --*

            The Amazon Resource Name (ARN) of an IAM role that's used
            to access customer resources, such as Amazon Simple
            Storage Service (Amazon S3) data.

          * **Targets** *(dict) --*

            A collection of targets to crawl.

            * **S3Targets** *(list) --*

              Specifies Amazon Simple Storage Service (Amazon S3)
              targets.

              * *(dict) --*

                Specifies a data store in Amazon Simple Storage
                Service (Amazon S3).

                * **Path** *(string) --*

                  The path to the Amazon S3 target.

                * **Exclusions** *(list) --*

                  A list of glob patterns used to exclude paths from
                  the crawl. For more information, see Catalog Tables
                  with a Crawler.

                  * *(string) --*

                * **ConnectionName** *(string) --*

                  The name of a connection which allows a job or
                  crawler to access data in Amazon S3 within an Amazon
                  Virtual Private Cloud environment (Amazon VPC).

                * **SampleSize** *(integer) --*

                  Sets the number of files in each leaf folder to be
                  crawled when crawling sample files in a dataset. If
                  not set, all the files are crawled. A valid value is
                  an integer between 1 and 249.

                * **EventQueueArn** *(string) --*

                  A valid Amazon SQS ARN. For example,
                  "arn:aws:sqs:region:account:sqs".

                * **DlqEventQueueArn** *(string) --*

                  A valid Amazon dead-letter SQS ARN. For example,
                  "arn:aws:sqs:region:account:deadLetterQueue".

            * **JdbcTargets** *(list) --*

              Specifies JDBC targets.

              * *(dict) --*

                Specifies a JDBC data store to crawl.

                * **ConnectionName** *(string) --*

                  The name of the connection to use to connect to the
                  JDBC target.

                * **Path** *(string) --*

                  The path of the JDBC target.

                * **Exclusions** *(list) --*

                  A list of glob patterns that specify what to exclude
                  from the crawl. For more information, see Catalog
                  Tables with a Crawler.

                  * *(string) --*

                * **EnableAdditionalMetadata** *(list) --*

                  Specify a value of "RAWTYPES" or "COMMENTS" to
                  enable additional metadata in table responses.
                  "RAWTYPES" provides the native-level datatype.
                  "COMMENTS" provides comments associated with a
                  column or table in the database.

                  If you do not need additional metadata, keep the
                  field empty.

                  * *(string) --*

            * **MongoDBTargets** *(list) --*

              Specifies Amazon DocumentDB or MongoDB targets.

              * *(dict) --*

                Specifies an Amazon DocumentDB or MongoDB data store
                to crawl.

                * **ConnectionName** *(string) --*

                  The name of the connection to use to connect to the
                  Amazon DocumentDB or MongoDB target.

                * **Path** *(string) --*

                  The path of the Amazon DocumentDB or MongoDB target
                  (database/collection).

                * **ScanAll** *(boolean) --*

                  Indicates whether to scan all the records, or to
                  sample rows from the table. Scanning all the records
                  can take a long time when the table is not a high
                  throughput table.

                  A value of "true" means to scan all records, while a
                  value of "false" means to sample the records. If no
                  value is specified, the value defaults to "true".

            * **DynamoDBTargets** *(list) --*

              Specifies Amazon DynamoDB targets.

              * *(dict) --*

                Specifies an Amazon DynamoDB table to crawl.

                * **Path** *(string) --*

                  The name of the DynamoDB table to crawl.

                * **scanAll** *(boolean) --*

                  Indicates whether to scan all the records, or to
                  sample rows from the table. Scanning all the records
                  can take a long time when the table is not a high
                  throughput table.

                  A value of "true" means to scan all records, while a
                  value of "false" means to sample the records. If no
                  value is specified, the value defaults to "true".

                * **scanRate** *(float) --*

                  The percentage of the configured read capacity units
                  for the Glue crawler to use. Read capacity units is
                  a term defined by DynamoDB, and is a numeric value
                  that acts as a rate limiter for the number of reads
                  that can be performed on that table per second.

                  The valid values are null or a value between 0.1 and
                  1.5. A null value is used when the user does not
                  provide a value, and defaults to 0.5 of the
                  configured Read Capacity Unit (for provisioned
                  tables), or 0.25 of the max configured Read Capacity
                  Unit (for tables using on-demand mode).

            * **CatalogTargets** *(list) --*

              Specifies Glue Data Catalog targets.

              * *(dict) --*

                Specifies a Glue Data Catalog target.

                * **DatabaseName** *(string) --*

                  The name of the database to be synchronized.

                * **Tables** *(list) --*

                  A list of the tables to be synchronized.

                  * *(string) --*

                * **ConnectionName** *(string) --*

                  The name of the connection for an Amazon S3-backed
                  Data Catalog table to be a target of the crawl when
                  using a "Catalog" connection type paired with a
                  "NETWORK" Connection type.

                * **EventQueueArn** *(string) --*

                  A valid Amazon SQS ARN. For example,
                  "arn:aws:sqs:region:account:sqs".

                * **DlqEventQueueArn** *(string) --*

                  A valid Amazon dead-letter SQS ARN. For example,
                  "arn:aws:sqs:region:account:deadLetterQueue".

            * **DeltaTargets** *(list) --*

              Specifies Delta data store targets.

              * *(dict) --*

                Specifies a Delta data store to crawl one or more
                Delta tables.

                * **DeltaTables** *(list) --*

                  A list of the Amazon S3 paths to the Delta tables.

                  * *(string) --*

                * **ConnectionName** *(string) --*

                  The name of the connection to use to connect to the
                  Delta table target.

                * **WriteManifest** *(boolean) --*

                  Specifies whether to write the manifest files to the
                  Delta table path.

                * **CreateNativeDeltaTable** *(boolean) --*

                  Specifies whether the crawler will create native
                  tables, to allow integration with query engines that
                  support querying of the Delta transaction log
                  directly.

            * **IcebergTargets** *(list) --*

              Specifies Apache Iceberg data store targets.

              * *(dict) --*

                Specifies an Apache Iceberg data source where Iceberg
                tables are stored in Amazon S3.

                * **Paths** *(list) --*

                  One or more Amazon S3 paths that contain Iceberg
                  metadata folders as "s3://bucket/prefix".

                  * *(string) --*

                * **ConnectionName** *(string) --*

                  The name of the connection to use to connect to the
                  Iceberg target.

                * **Exclusions** *(list) --*

                  A list of glob patterns that specify what to exclude
                  from the crawl. For more information, see Catalog
                  Tables with a Crawler.

                  * *(string) --*

                * **MaximumTraversalDepth** *(integer) --*

                  The maximum depth of Amazon S3 paths that the
                  crawler can traverse to discover the Iceberg
                  metadata folder in your Amazon S3 path. Used to
                  limit the crawler run time.

            * **HudiTargets** *(list) --*

              Specifies Apache Hudi data store targets.

              * *(dict) --*

                Specifies an Apache Hudi data source.

                * **Paths** *(list) --*

                  An array of Amazon S3 location strings for Hudi,
                  each indicating the root folder in which the
                  metadata files for a Hudi table reside. The Hudi
                  folder may be located in a child folder of the root
                  folder.

                  The crawler will scan all folders underneath a path
                  for a Hudi folder.

                  * *(string) --*

                * **ConnectionName** *(string) --*

                  The name of the connection to use to connect to the
                  Hudi target. If your Hudi files are stored in
                  buckets that require VPC authorization, you can set
                  their connection properties here.

                * **Exclusions** *(list) --*

                  A list of glob patterns that specify what to exclude
                  from the crawl. For more information, see Catalog
                  Tables with a Crawler.

                  * *(string) --*

                * **MaximumTraversalDepth** *(integer) --*

                  The maximum depth of Amazon S3 paths that the
                  crawler can traverse to discover the Hudi metadata
                  folder in your Amazon S3 path. Used to limit the
                  crawler run time.

          * **DatabaseName** *(string) --*

            The name of the database in which the crawler's output is
            stored.

          * **Description** *(string) --*

            A description of the crawler.

          * **Classifiers** *(list) --*

            A list of UTF-8 strings that specify the custom
            classifiers that are associated with the crawler.

            * *(string) --*

          * **RecrawlPolicy** *(dict) --*

            A policy that specifies whether to crawl the entire
            dataset again, or to crawl only folders that were added
            since the last crawler run.

            * **RecrawlBehavior** *(string) --*

              Specifies whether to crawl the entire dataset again or
              to crawl only folders that were added since the last
              crawler run.

              A value of "CRAWL_EVERYTHING" specifies crawling the
              entire dataset again.

              A value of "CRAWL_NEW_FOLDERS_ONLY" specifies crawling
              only folders that were added since the last crawler run.

              A value of "CRAWL_EVENT_MODE" specifies crawling only
              the changes identified by Amazon S3 events.

          * **SchemaChangePolicy** *(dict) --*

            The policy that specifies update and delete behaviors for
            the crawler.

            * **UpdateBehavior** *(string) --*

              The update behavior when the crawler finds a changed
              schema.

            * **DeleteBehavior** *(string) --*

              The deletion behavior when the crawler finds a deleted
              object.

          * **LineageConfiguration** *(dict) --*

            A configuration that specifies whether data lineage is
            enabled for the crawler.

            * **CrawlerLineageSettings** *(string) --*

              Specifies whether data lineage is enabled for the
              crawler. Valid values are:

              * ENABLE: enables data lineage for the crawler

              * DISABLE: disables data lineage for the crawler

          * **State** *(string) --*

            Indicates whether the crawler is running, or whether a run
            is pending.

          * **TablePrefix** *(string) --*

            The prefix added to the names of tables that are created.

          * **Schedule** *(dict) --*

            For scheduled crawlers, the schedule when the crawler
            runs.

            * **ScheduleExpression** *(string) --*

              A "cron" expression used to specify the schedule (see
              Time-Based Schedules for Jobs and Crawlers). For
              example, to run something every day at 12:15 UTC, you
              would specify: "cron(15 12 * * ? *)".

            * **State** *(string) --*

              The state of the schedule.

          * **CrawlElapsedTime** *(integer) --*

            If the crawler is running, contains the total time elapsed
            since the last crawl began.

          * **CreationTime** *(datetime) --*

            The time that the crawler was created.

          * **LastUpdated** *(datetime) --*

            The time that the crawler was last updated.

          * **LastCrawl** *(dict) --*

            The status of the last crawl, and potentially error
            information if an error occurred.

            * **Status** *(string) --*

              Status of the last crawl.

            * **ErrorMessage** *(string) --*

              If an error occurred, the error information about the
              last crawl.

            * **LogGroup** *(string) --*

              The log group for the last crawl.

            * **LogStream** *(string) --*

              The log stream for the last crawl.

            * **MessagePrefix** *(string) --*

              The prefix for a message about this crawl.

            * **StartTime** *(datetime) --*

              The time at which the crawl started.

          * **Version** *(integer) --*

            The version of the crawler.

          * **Configuration** *(string) --*

            Crawler configuration information. This versioned JSON
            string allows users to specify aspects of a crawler's
            behavior. For more information, see Setting crawler
            configuration options.

          * **CrawlerSecurityConfiguration** *(string) --*

            The name of the "SecurityConfiguration" structure to be
            used by this crawler.

          * **LakeFormationConfiguration** *(dict) --*

            Specifies whether the crawler should use Lake Formation
            credentials for the crawler instead of the IAM role
            credentials.

            * **UseLakeFormationCredentials** *(boolean) --*

              Specifies whether to use Lake Formation credentials for
              the crawler instead of the IAM role credentials.

            * **AccountId** *(string) --*

              Required for cross-account crawls. For crawls in the
              same account as the target data, this can be left as
              null.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.OperationTimeoutException"
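
   As a minimal sketch of reading the response structure above, the
   helper below collects the locations named under "Targets". The
   function name "summarize_crawler_targets" is hypothetical and not
   part of boto3; it only reads keys documented in this response and
   assumes "response" is the dict returned by the call.

```python
def summarize_crawler_targets(response):
    """Map each target type present in the response to its locations."""
    targets = response.get("Crawler", {}).get("Targets", {})
    summary = {}

    # S3, JDBC, MongoDB, and DynamoDB targets each carry a single "Path".
    for key in ("S3Targets", "JdbcTargets", "MongoDBTargets", "DynamoDBTargets"):
        paths = [t["Path"] for t in targets.get(key, []) if "Path" in t]
        if paths:
            summary[key] = paths

    # Delta targets list table paths under "DeltaTables"; Iceberg and Hudi
    # targets list them under "Paths".
    for key, field in (
        ("DeltaTargets", "DeltaTables"),
        ("IcebergTargets", "Paths"),
        ("HudiTargets", "Paths"),
    ):
        paths = [p for t in targets.get(key, []) for p in t.get(field, [])]
        if paths:
            summary[key] = paths

    # Catalog targets name a database and its tables rather than a path.
    tables = [
        f"{t['DatabaseName']}.{name}"
        for t in targets.get("CatalogTargets", [])
        for name in t.get("Tables", [])
    ]
    if tables:
        summary["CatalogTargets"] = tables
    return summary
```

   Target types with no entries are omitted from the result, so an
   empty dict means the response listed no targets.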


check_schema_version_validity
*****************************

Glue.Client.check_schema_version_validity(**kwargs)

   Validates the supplied schema. This call has no side effects; it
   simply validates the supplied schema using "DataFormat" as the
   format. Because it does not take a schema set name, no
   compatibility checks are performed.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.check_schema_version_validity(
          DataFormat='AVRO'|'JSON'|'PROTOBUF',
          SchemaDefinition='string'
      )

   Parameters:
      * **DataFormat** (*string*) --

        **[REQUIRED]**

        The data format of the schema definition. Currently "AVRO",
        "JSON" and "PROTOBUF" are supported.

      * **SchemaDefinition** (*string*) --

        **[REQUIRED]**

        The definition of the schema that has to be validated.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Valid': True|False,
             'Error': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Valid** *(boolean) --*

          Returns true if the schema is valid and false otherwise.

        * **Error** *(string) --*

          A validation failure error message.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.InternalServiceException"
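
   The following is a minimal sketch of calling this method for an
   Avro schema, assuming "client" is a Glue client created with
   "boto3.client('glue')". The helper name "validate_avro_schema" is
   hypothetical; the schema is serialized with "json.dumps" because
   "SchemaDefinition" is passed as a string.

```python
import json


def validate_avro_schema(client, schema):
    """Validate an Avro schema dict; return (is_valid, error_message)."""
    response = client.check_schema_version_validity(
        DataFormat="AVRO",
        SchemaDefinition=json.dumps(schema),
    )
    # "Error" carries the validation failure message; it may be absent
    # when the schema is valid, so read it defensively.
    return response["Valid"], response.get("Error")


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   ok, error = validate_avro_schema(
#       boto3.client("glue"),
#       {"type": "record", "name": "User",
#        "fields": [{"name": "id", "type": "long"}]},
#   )
```

   Because the call has no side effects, a check like this can run
   before registering a schema without changing any registry state.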


get_triggers
************

Glue.Client.get_triggers(**kwargs)

   Gets all the triggers associated with a job.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_triggers(
          NextToken='string',
          DependentJobName='string',
          MaxResults=123
      )

   Parameters:
      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation call.

      * **DependentJobName** (*string*) -- The name of the job to
        retrieve triggers for. The trigger that can start this job is
        returned, and if there is no such trigger, all triggers are
        returned.

      * **MaxResults** (*integer*) -- The maximum size of the
        response.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Triggers': [
                 {
                     'Name': 'string',
                     'WorkflowName': 'string',
                     'Id': 'string',
                     'Type': 'SCHEDULED'|'CONDITIONAL'|'ON_DEMAND'|'EVENT',
                     'State': 'CREATING'|'CREATED'|'ACTIVATING'|'ACTIVATED'|'DEACTIVATING'|'DEACTIVATED'|'DELETING'|'UPDATING',
                     'Description': 'string',
                     'Schedule': 'string',
                     'Actions': [
                         {
                             'JobName': 'string',
                             'Arguments': {
                                 'string': 'string'
                             },
                             'Timeout': 123,
                             'SecurityConfiguration': 'string',
                             'NotificationProperty': {
                                 'NotifyDelayAfter': 123
                             },
                             'CrawlerName': 'string'
                         },
                     ],
                     'Predicate': {
                         'Logical': 'AND'|'ANY',
                         'Conditions': [
                             {
                                 'LogicalOperator': 'EQUALS',
                                 'JobName': 'string',
                                 'State': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT'|'ERROR'|'WAITING'|'EXPIRED',
                                 'CrawlerName': 'string',
                                 'CrawlState': 'RUNNING'|'CANCELLING'|'CANCELLED'|'SUCCEEDED'|'FAILED'|'ERROR'
                             },
                         ]
                     },
                     'EventBatchingCondition': {
                         'BatchSize': 123,
                         'BatchWindow': 123
                     }
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Triggers** *(list) --*

          A list of triggers for the specified job.

          * *(dict) --*

            Information about a specific trigger.

            * **Name** *(string) --*

              The name of the trigger.

            * **WorkflowName** *(string) --*

              The name of the workflow associated with the trigger.

            * **Id** *(string) --*

              Reserved for future use.

            * **Type** *(string) --*

              The type of trigger that this is.

            * **State** *(string) --*

              The current state of the trigger.

            * **Description** *(string) --*

              A description of this trigger.

            * **Schedule** *(string) --*

              A "cron" expression used to specify the schedule (see
              Time-Based Schedules for Jobs and Crawlers). For
              example, to run something every day at 12:15 UTC, you
              would specify: "cron(15 12 * * ? *)".

            * **Actions** *(list) --*

              The actions initiated by this trigger.

              * *(dict) --*

                Defines an action to be initiated by a trigger.

                * **JobName** *(string) --*

                  The name of a job to be run.

                * **Arguments** *(dict) --*

                  The job arguments used when this trigger fires. For
                  this job run, they replace the default arguments set
                  in the job definition itself.

                  You can specify arguments here that your own job-
                  execution script consumes, as well as arguments that
                  Glue itself consumes.

                  For information about how to specify and consume
                  your own Job arguments, see the Calling Glue APIs in
                  Python topic in the developer guide.

                  For information about the key-value pairs that Glue
                  consumes to set up your job, see the Special
                  Parameters Used by Glue topic in the developer
                  guide.

                  * *(string) --*

                    * *(string) --*

                * **Timeout** *(integer) --*

                  The "JobRun" timeout in minutes. This is the maximum
                  time that a job run can consume resources before it
                  is terminated and enters "TIMEOUT" status. This
                  overrides the timeout value set in the parent job.

                  Jobs must have timeout values less than 7 days or
                  10080 minutes. Otherwise, the jobs will throw an
                  exception.

                  When the value is left blank, the timeout defaults
                  to 2880 minutes.

                  Any existing Glue jobs that had a timeout value
                  greater than 7 days will default to 7 days. For
                  instance, if you have specified a timeout of 20 days
                  for a batch job, it will be stopped on the 7th day.

                  For streaming jobs, if you have set up a maintenance
                  window, it will be restarted during the maintenance
                  window after 7 days.

                * **SecurityConfiguration** *(string) --*

                  The name of the "SecurityConfiguration" structure to
                  be used with this action.

                * **NotificationProperty** *(dict) --*

                  Specifies configuration properties of a job run
                  notification.

                  * **NotifyDelayAfter** *(integer) --*

                    After a job run starts, the number of minutes to
                    wait before sending a job run delay notification.

                * **CrawlerName** *(string) --*

                  The name of the crawler to be used with this action.

            * **Predicate** *(dict) --*

              The predicate of this trigger, which defines when it
              will fire.

              * **Logical** *(string) --*

                An optional field if only one condition is listed. If
                multiple conditions are listed, then this field is
                required.

              * **Conditions** *(list) --*

                A list of the conditions that determine when the
                trigger will fire.

                * *(dict) --*

                  Defines a condition under which a trigger fires.

                  * **LogicalOperator** *(string) --*

                    A logical operator.

                  * **JobName** *(string) --*

                    The name of the job whose "JobRuns" this condition
                    applies to, and on which this trigger waits.

                  * **State** *(string) --*

                    The condition state. Currently, the only job
                    states that a trigger can listen for are
                    "SUCCEEDED", "STOPPED", "FAILED", and "TIMEOUT".
                    The only crawler states that a trigger can listen
                    for are "SUCCEEDED", "FAILED", and "CANCELLED".

                  * **CrawlerName** *(string) --*

                    The name of the crawler to which this condition
                    applies.

                  * **CrawlState** *(string) --*

                    The state of the crawler to which this condition
                    applies.

            * **EventBatchingCondition** *(dict) --*

              Batch condition that must be met (specified number of
              events received or batch time window expired) before
              the EventBridge event trigger fires.

              * **BatchSize** *(integer) --*

                Number of events that must be received from Amazon
                EventBridge before the EventBridge event trigger
                fires.

              * **BatchWindow** *(integer) --*

                Window of time in seconds after which the EventBridge
                event trigger fires. The window starts when the first
                event is received.

        * **NextToken** *(string) --*

          A continuation token, if not all the requested triggers have
          yet been returned.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
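
   Since results may span several responses, a caller typically
   follows "NextToken" until it is no longer returned. The sketch
   below shows one way to do that, assuming "client" is a Glue client;
   the generator name "iter_triggers" and its "page_size" argument
   are hypothetical.

```python
def iter_triggers(client, job_name=None, page_size=50):
    """Yield every trigger dict, following NextToken until exhausted."""
    kwargs = {"MaxResults": page_size}
    if job_name is not None:
        kwargs["DependentJobName"] = job_name

    while True:
        response = client.get_triggers(**kwargs)
        yield from response.get("Triggers", [])

        # A missing or empty token means there are no further pages.
        token = response.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   for trigger in iter_triggers(boto3.client("glue"), job_name="my-job"):
#       print(trigger["Name"], trigger["Type"], trigger["State"])
```

   Yielding triggers one at a time keeps memory use flat and lets the
   caller stop early without fetching the remaining pages.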


put_schema_version_metadata
***************************

Glue.Client.put_schema_version_metadata(**kwargs)

   Puts the metadata key value pair for a specified schema version ID.
   A maximum of 10 key value pairs will be allowed per schema version.
   They can be added over one or more calls.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.put_schema_version_metadata(
          SchemaId={
              'SchemaArn': 'string',
              'SchemaName': 'string',
              'RegistryName': 'string'
          },
          SchemaVersionNumber={
              'LatestVersion': True|False,
              'VersionNumber': 123
          },
          SchemaVersionId='string',
          MetadataKeyValue={
              'MetadataKey': 'string',
              'MetadataValue': 'string'
          }
      )

   Parameters:
      * **SchemaId** (*dict*) --

        The unique ID for the schema.

        * **SchemaArn** *(string) --*

          The Amazon Resource Name (ARN) of the schema. One of
          "SchemaArn" or "SchemaName" has to be provided.

        * **SchemaName** *(string) --*

          The name of the schema. One of "SchemaArn" or "SchemaName"
          has to be provided.

        * **RegistryName** *(string) --*

          The name of the schema registry that contains the schema.

      * **SchemaVersionNumber** (*dict*) --

        The version number of the schema.

        * **LatestVersion** *(boolean) --*

          The latest version available for the schema.

        * **VersionNumber** *(integer) --*

          The version number of the schema.

      * **SchemaVersionId** (*string*) -- The unique version ID of the
        schema version.

      * **MetadataKeyValue** (*dict*) --

        **[REQUIRED]**

        The metadata key's corresponding value.

        * **MetadataKey** *(string) --*

          A metadata key.

        * **MetadataValue** *(string) --*

          A metadata key’s corresponding value.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'SchemaArn': 'string',
             'SchemaName': 'string',
             'RegistryName': 'string',
             'LatestVersion': True|False,
             'VersionNumber': 123,
             'SchemaVersionId': 'string',
             'MetadataKey': 'string',
             'MetadataValue': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **SchemaArn** *(string) --*

          The Amazon Resource Name (ARN) for the schema.

        * **SchemaName** *(string) --*

          The name for the schema.

        * **RegistryName** *(string) --*

          The name for the registry.

        * **LatestVersion** *(boolean) --*

          The latest version of the schema.

        * **VersionNumber** *(integer) --*

          The version number of the schema.

        * **SchemaVersionId** *(string) --*

          The unique version ID of the schema version.

        * **MetadataKey** *(string) --*

          The metadata key.

        * **MetadataValue** *(string) --*

          The value of the metadata key.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"
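
   Each call stores a single key value pair, so attaching several
   pairs takes one call per pair. The sketch below illustrates this,
   assuming "client" is a Glue client and the schema is addressed by
   name and version number. The helper name "tag_schema_version" is
   hypothetical, and its length check only guards the pairs passed in
   one invocation, not pairs already stored on the schema version.

```python
def tag_schema_version(client, registry, schema, version, metadata):
    """Attach each key/value pair in `metadata` to one schema version."""
    # The service allows at most 10 pairs per schema version; this local
    # check cannot see pairs added by earlier calls.
    if len(metadata) > 10:
        raise ValueError("at most 10 key value pairs are allowed")

    stored = []
    for key, value in metadata.items():
        response = client.put_schema_version_metadata(
            SchemaId={"SchemaName": schema, "RegistryName": registry},
            SchemaVersionNumber={"VersionNumber": version},
            MetadataKeyValue={"MetadataKey": key, "MetadataValue": value},
        )
        stored.append((response["MetadataKey"], response["MetadataValue"]))
    return stored


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   tag_schema_version(boto3.client("glue"), "my-registry", "my-schema",
#                      1, {"owner": "data-team"})
```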


delete_partition_index
**********************

Glue.Client.delete_partition_index(**kwargs)

   Deletes a specified partition index from an existing table.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_partition_index(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          IndexName='string'
      )

   Parameters:
      * **CatalogId** (*string*) -- The catalog ID where the table
        resides.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        Specifies the name of a database from which you want to delete
        a partition index.

      * **TableName** (*string*) --

        **[REQUIRED]**

        Specifies the name of a table from which you want to delete a
        partition index.

      * **IndexName** (*string*) --

        **[REQUIRED]**

        The name of the partition index to be deleted.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.ConflictException"

   * "Glue.Client.exceptions.GlueEncryptionException"
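
   A cleanup script may want to treat a missing index as success
   rather than as a failure. The sketch below does so by catching
   "EntityNotFoundException" through the client's "exceptions"
   attribute, assuming "client" is a Glue client. The helper name
   "delete_partition_index_if_exists" is hypothetical.

```python
def delete_partition_index_if_exists(client, database, table, index,
                                     catalog_id=None):
    """Delete a partition index; return False if it was not found."""
    kwargs = {
        "DatabaseName": database,
        "TableName": table,
        "IndexName": index,
    }
    # CatalogId is optional, so only send it when the caller supplies one.
    if catalog_id is not None:
        kwargs["CatalogId"] = catalog_id

    try:
        client.delete_partition_index(**kwargs)
    except client.exceptions.EntityNotFoundException:
        return False
    return True


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   delete_partition_index_if_exists(
#       boto3.client("glue"), "sales_db", "orders", "orders_by_date")
```

   Other exceptions listed above, such as "ConflictException", are
   deliberately left to propagate to the caller.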


update_job_from_source_control
******************************

Glue.Client.update_job_from_source_control(**kwargs)

   Synchronizes a job from the source control repository. This
   operation takes the job artifacts that are located in the remote
   repository and updates the Glue internal stores with these
   artifacts.

   This API supports optional parameters which take in the repository
   information.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.update_job_from_source_control(
          JobName='string',
          Provider='GITHUB'|'GITLAB'|'BITBUCKET'|'AWS_CODE_COMMIT',
          RepositoryName='string',
          RepositoryOwner='string',
          BranchName='string',
          Folder='string',
          CommitId='string',
          AuthStrategy='PERSONAL_ACCESS_TOKEN'|'AWS_SECRETS_MANAGER',
          AuthToken='string'
      )

   Parameters:
      * **JobName** (*string*) -- The name of the Glue job to be
        synchronized to or from the remote repository.

      * **Provider** (*string*) -- The provider for the remote
        repository. Possible values: GITHUB, AWS_CODE_COMMIT, GITLAB,
        BITBUCKET.

      * **RepositoryName** (*string*) -- The name of the remote
        repository that contains the job artifacts. For BitBucket
        providers, "RepositoryName" should include "WorkspaceName".
        Use the format "<WorkspaceName>/<RepositoryName>".

      * **RepositoryOwner** (*string*) -- The owner of the remote
        repository that contains the job artifacts.

      * **BranchName** (*string*) -- An optional branch in the remote
        repository.

      * **Folder** (*string*) -- An optional folder in the remote
        repository.

      * **CommitId** (*string*) -- A commit ID for a commit in the
        remote repository.

      * **AuthStrategy** (*string*) -- The type of authentication,
        which can be an authentication token stored in Amazon Web
        Services Secrets Manager, or a personal access token.

      * **AuthToken** (*string*) -- The value of the authorization
        token.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'JobName': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **JobName** *(string) --*

          The name of the Glue job.

   **Exceptions**

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.ValidationException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"
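
   The repository parameters above can be assembled programmatically.
   The following is a minimal sketch, not an official helper: the
   function names "build_source_control_request" and
   "sync_job_from_repo" and the "workspace" argument are hypothetical,
   and authentication parameters are deliberately left out. It only
   illustrates the Bitbucket "<WorkspaceName>/<RepositoryName>" format
   and how unset optional parameters can be omitted.

```python
def build_source_control_request(job_name, provider, repo_name, repo_owner,
                                 branch=None, folder=None, commit_id=None,
                                 workspace=None):
    """Assemble keyword arguments for update_job_from_source_control.

    Hypothetical helper; `workspace` is only used for Bitbucket.
    """
    # For Bitbucket, RepositoryName should include the workspace name.
    if provider == "BITBUCKET" and workspace:
        repo_name = f"{workspace}/{repo_name}"
    request = {
        "JobName": job_name,
        "Provider": provider,
        "RepositoryName": repo_name,
        "RepositoryOwner": repo_owner,
    }
    optional = {"BranchName": branch, "Folder": folder, "CommitId": commit_id}
    # Leave out optional parameters that were not supplied.
    request.update({k: v for k, v in optional.items() if v is not None})
    return request


def sync_job_from_repo(client, request):
    """Call the API and return the job name reported in the response.

    `client` is assumed to be a Glue client, e.g. boto3.client('glue').
    """
    response = client.update_job_from_source_control(**request)
    return response["JobName"]
```

   With a real client, "sync_job_from_repo(boto3.client('glue'),
   request)" would issue the call; the request dictionary can be
   inspected beforehand without contacting the service.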


update_crawler_schedule
***********************

Glue.Client.update_crawler_schedule(**kwargs)

   Updates the schedule of a crawler using a "cron" expression.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.update_crawler_schedule(
          CrawlerName='string',
          Schedule='string'
      )

   Parameters:
      * **CrawlerName** (*string*) --

        **[REQUIRED]**

        The name of the crawler whose schedule to update.

      * **Schedule** (*string*) -- The updated "cron" expression used
        to specify the schedule (see Time-Based Schedules for Jobs and
        Crawlers). For example, to run something every day at 12:15
        UTC, you would specify: "cron(15 12 * * ? *)".

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.VersionMismatchException"

   * "Glue.Client.exceptions.SchedulerTransitioningException"

   * "Glue.Client.exceptions.OperationTimeoutException"
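
   As an illustration of the "Schedule" format, the sketch below
   builds a daily "cron" expression in the same shape as the
   "cron(15 12 * * ? *)" example above and passes it to the call. The
   helper names "daily_cron" and "set_daily_schedule" are hypothetical
   and cover only the once-a-day case.

```python
def daily_cron(hour, minute):
    """Return a cron expression that fires every day at hour:minute UTC."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute must be 0-59")
    # Field order: minutes, hours, day-of-month, month, day-of-week, year.
    return f"cron({minute} {hour} * * ? *)"


def set_daily_schedule(client, crawler_name, hour, minute):
    """Update a crawler so that it runs once a day at the given UTC time.

    `client` is assumed to be a Glue client, e.g. boto3.client('glue').
    """
    schedule = daily_cron(hour, minute)
    client.update_crawler_schedule(CrawlerName=crawler_name, Schedule=schedule)
    return schedule
```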


describe_entity
***************

Glue.Client.describe_entity(**kwargs)

   Provides details regarding the entity used with the connection
   type, with a description of the data model for each field in the
   selected entity.

   The response includes all the fields which make up the entity.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.describe_entity(
          ConnectionName='string',
          CatalogId='string',
          EntityName='string',
          NextToken='string',
          DataStoreApiVersion='string'
      )

   Parameters:
      * **ConnectionName** (*string*) --

        **[REQUIRED]**

        The name of the connection that contains the connection type
        credentials.

      * **CatalogId** (*string*) -- The catalog ID of the catalog that
        contains the connection. This can be null. By default, the
        Amazon Web Services account ID is the catalog ID.

      * **EntityName** (*string*) --

        **[REQUIRED]**

        The name of the entity that you want to describe from the
        connection type.

      * **NextToken** (*string*) -- A continuation token, included if
        this is a continuation call.

      * **DataStoreApiVersion** (*string*) -- The version of the API
        used for the data store.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Fields': [
                 {
                     'FieldName': 'string',
                     'Label': 'string',
                     'Description': 'string',
                     'FieldType': 'INT'|'SMALLINT'|'BIGINT'|'FLOAT'|'LONG'|'DATE'|'BOOLEAN'|'MAP'|'ARRAY'|'STRING'|'TIMESTAMP'|'DECIMAL'|'BYTE'|'SHORT'|'DOUBLE'|'STRUCT',
                     'IsPrimaryKey': True|False,
                     'IsNullable': True|False,
                     'IsRetrievable': True|False,
                     'IsFilterable': True|False,
                     'IsPartitionable': True|False,
                     'IsCreateable': True|False,
                     'IsUpdateable': True|False,
                     'IsUpsertable': True|False,
                     'IsDefaultOnCreate': True|False,
                     'SupportedValues': [
                         'string',
                     ],
                     'SupportedFilterOperators': [
                         'LESS_THAN'|'GREATER_THAN'|'BETWEEN'|'EQUAL_TO'|'NOT_EQUAL_TO'|'GREATER_THAN_OR_EQUAL_TO'|'LESS_THAN_OR_EQUAL_TO'|'CONTAINS'|'ORDER_BY',
                     ],
                     'ParentField': 'string',
                     'NativeDataType': 'string',
                     'CustomProperties': {
                         'string': 'string'
                     }
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Fields** *(list) --*

          Describes the fields for that connector entity. This is the
          list of "Field" objects. "Field" is very similar to column
          in a database. The "Field" object has information about
          different properties associated with fields in the
          connector.

          * *(dict) --*

            The "Field" object has information about the different
            properties associated with a field in the connector.

            * **FieldName** *(string) --*

              A unique identifier for the field.

            * **Label** *(string) --*

              A readable label used for the field.

            * **Description** *(string) --*

              A description of the field.

            * **FieldType** *(string) --*

              The type of data in the field.

            * **IsPrimaryKey** *(boolean) --*

              Indicates whether this field can be used as a primary
              key for the given entity.

            * **IsNullable** *(boolean) --*

              Indicates whether this field is nullable.

            * **IsRetrievable** *(boolean) --*

              Indicates whether this field is retrievable, that is,
              whether it can be included in the "SELECT" clause of a
              SQL query.

            * **IsFilterable** *(boolean) --*

              Indicates whether this field can be used in a filter
              clause ("WHERE" clause) of a SQL statement when querying
              data.

            * **IsPartitionable** *(boolean) --*

              Indicates whether a given field can be used in
              partitioning the query made to SaaS.

            * **IsCreateable** *(boolean) --*

              Indicates whether this field can be created as part of a
              destination write.

            * **IsUpdateable** *(boolean) --*

              Indicates whether this field can be updated as part of a
              destination write.

            * **IsUpsertable** *(boolean) --*

              Indicates whether this field can be upserted as part of
              a destination write.

            * **IsDefaultOnCreate** *(boolean) --*

              Indicates whether this field is populated automatically
              when the object is created, such as a created-at
              timestamp.

            * **SupportedValues** *(list) --*

              A list of supported values for the field.

              * *(string) --*

            * **SupportedFilterOperators** *(list) --*

              Indicates the supported filter operators for this field.

              * *(string) --*

            * **ParentField** *(string) --*

              A parent field name for a nested field.

            * **NativeDataType** *(string) --*

              The data type returned by the SaaS API, such as
              “picklist” or “textarea” from Salesforce.

            * **CustomProperties** *(dict) --*

              Optional map of keys which may be returned.

              * *(string) --*

                * *(string) --*

        * **NextToken** *(string) --*

          A continuation token, present if the current segment is not
          the last.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.GlueEncryptionException"

   * "Glue.Client.exceptions.ValidationException"

   * "Glue.Client.exceptions.FederationSourceException"

   * "Glue.Client.exceptions.AccessDeniedException"
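
   Because the response may be split across several segments, a caller
   typically follows "NextToken" until it is absent. The sketch below
   shows one way to do that and to pick out the filterable fields; the
   helper names "iter_entity_fields" and "filterable_field_names" are
   hypothetical, not part of the client.

```python
def iter_entity_fields(client, connection_name, entity_name, **extra):
    """Yield every Field dict of an entity, following NextToken.

    `client` is assumed to be a Glue client, e.g. boto3.client('glue').
    `extra` may carry optional parameters such as CatalogId.
    """
    kwargs = {"ConnectionName": connection_name, "EntityName": entity_name}
    kwargs.update(extra)
    while True:
        page = client.describe_entity(**kwargs)
        yield from page.get("Fields", [])
        token = page.get("NextToken")
        if not token:
            break  # the current segment was the last one
        kwargs["NextToken"] = token


def filterable_field_names(fields):
    """Return the names of fields usable in a filter ("WHERE") clause."""
    return [f["FieldName"] for f in fields if f.get("IsFilterable")]
```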


delete_resource_policy
**********************

Glue.Client.delete_resource_policy(**kwargs)

   Deletes a specified policy.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_resource_policy(
          PolicyHashCondition='string',
          ResourceArn='string'
      )

   Parameters:
      * **PolicyHashCondition** (*string*) -- The hash value returned
        when this policy was set.

      * **ResourceArn** (*string*) -- The ARN of the Glue resource for
        the resource policy to be deleted.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.ConditionCheckFailureException"
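
   A possible use of "PolicyHashCondition" is a guarded delete that
   only proceeds when the policy has not changed since its hash was
   recorded. The sketch below assumes that a hash mismatch surfaces as
   "ConditionCheckFailureException"; this page lists the exception but
   does not state when it is raised, so treat that as an assumption.
   The helper name "delete_policy_if_unchanged" is hypothetical.

```python
def delete_policy_if_unchanged(client, policy_hash, resource_arn=None):
    """Delete a resource policy only if its hash still matches.

    Returns True when the delete call succeeded and False when the
    service rejected the hash condition (assumed behaviour, see above).
    `client` is assumed to be a Glue client, e.g. boto3.client('glue').
    """
    kwargs = {"PolicyHashCondition": policy_hash}
    if resource_arn is not None:
        kwargs["ResourceArn"] = resource_arn
    try:
        client.delete_resource_policy(**kwargs)
    except client.exceptions.ConditionCheckFailureException:
        return False
    return True
```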


get_classifiers
***************

Glue.Client.get_classifiers(**kwargs)

   Lists all classifier objects in the Data Catalog.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_classifiers(
          MaxResults=123,
          NextToken='string'
      )

   Parameters:
      * **MaxResults** (*integer*) -- The size of the list to return
        (optional).

      * **NextToken** (*string*) -- An optional continuation token.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Classifiers': [
                 {
                     'GrokClassifier': {
                         'Name': 'string',
                         'Classification': 'string',
                         'CreationTime': datetime(2015, 1, 1),
                         'LastUpdated': datetime(2015, 1, 1),
                         'Version': 123,
                         'GrokPattern': 'string',
                         'CustomPatterns': 'string'
                     },
                     'XMLClassifier': {
                         'Name': 'string',
                         'Classification': 'string',
                         'CreationTime': datetime(2015, 1, 1),
                         'LastUpdated': datetime(2015, 1, 1),
                         'Version': 123,
                         'RowTag': 'string'
                     },
                     'JsonClassifier': {
                         'Name': 'string',
                         'CreationTime': datetime(2015, 1, 1),
                         'LastUpdated': datetime(2015, 1, 1),
                         'Version': 123,
                         'JsonPath': 'string'
                     },
                     'CsvClassifier': {
                         'Name': 'string',
                         'CreationTime': datetime(2015, 1, 1),
                         'LastUpdated': datetime(2015, 1, 1),
                         'Version': 123,
                         'Delimiter': 'string',
                         'QuoteSymbol': 'string',
                         'ContainsHeader': 'UNKNOWN'|'PRESENT'|'ABSENT',
                         'Header': [
                             'string',
                         ],
                         'DisableValueTrimming': True|False,
                         'AllowSingleColumn': True|False,
                         'CustomDatatypeConfigured': True|False,
                         'CustomDatatypes': [
                             'string',
                         ],
                         'Serde': 'OpenCSVSerDe'|'LazySimpleSerDe'|'None'
                     }
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Classifiers** *(list) --*

          The requested list of classifier objects.

          * *(dict) --*

            Classifiers are triggered during a crawl task. A
            classifier checks whether a given file is in a format it
            can handle. If it is, the classifier creates a schema in
            the form of a "StructType" object that matches that data
            format.

            You can use the standard classifiers that Glue provides,
            or you can write your own classifiers to best categorize
            your data sources and specify the appropriate schemas to
            use for them. A classifier can be a "grok" classifier, an
            "XML" classifier, a "JSON" classifier, or a custom "CSV"
            classifier, as specified in one of the fields in the
            "Classifier" object.

            * **GrokClassifier** *(dict) --*

              A classifier that uses "grok".

              * **Name** *(string) --*

                The name of the classifier.

              * **Classification** *(string) --*

                An identifier of the data format that the classifier
                matches, such as Twitter, JSON, Omniture logs, and so
                on.

              * **CreationTime** *(datetime) --*

                The time that this classifier was registered.

              * **LastUpdated** *(datetime) --*

                The time that this classifier was last updated.

              * **Version** *(integer) --*

                The version of this classifier.

              * **GrokPattern** *(string) --*

                The grok pattern applied to a data store by this
                classifier. For more information, see built-in
                patterns in Writing Custom Classifiers.

              * **CustomPatterns** *(string) --*

                Optional custom grok patterns defined by this
                classifier. For more information, see custom patterns
                in Writing Custom Classifiers.

            * **XMLClassifier** *(dict) --*

              A classifier for XML content.

              * **Name** *(string) --*

                The name of the classifier.

              * **Classification** *(string) --*

                An identifier of the data format that the classifier
                matches.

              * **CreationTime** *(datetime) --*

                The time that this classifier was registered.

              * **LastUpdated** *(datetime) --*

                The time that this classifier was last updated.

              * **Version** *(integer) --*

                The version of this classifier.

              * **RowTag** *(string) --*

                The XML tag designating the element that contains each
                record in an XML document being parsed. This can't
                identify a self-closing element (closed by "/>"). An
                empty row element that contains only attributes can be
                parsed as long as it ends with a closing tag (for
                example, "<row item_a="A" item_b="B"></row>" is okay,
                but "<row item_a="A" item_b="B" />" is not).

            * **JsonClassifier** *(dict) --*

              A classifier for JSON content.

              * **Name** *(string) --*

                The name of the classifier.

              * **CreationTime** *(datetime) --*

                The time that this classifier was registered.

              * **LastUpdated** *(datetime) --*

                The time that this classifier was last updated.

              * **Version** *(integer) --*

                The version of this classifier.

              * **JsonPath** *(string) --*

                A "JsonPath" string defining the JSON data for the
                classifier to classify. Glue supports a subset of
                JsonPath, as described in Writing JsonPath Custom
                Classifiers.

            * **CsvClassifier** *(dict) --*

              A classifier for comma-separated values (CSV).

              * **Name** *(string) --*

                The name of the classifier.

              * **CreationTime** *(datetime) --*

                The time that this classifier was registered.

              * **LastUpdated** *(datetime) --*

                The time that this classifier was last updated.

              * **Version** *(integer) --*

                The version of this classifier.

              * **Delimiter** *(string) --*

                A custom symbol to denote what separates each column
                entry in the row.

              * **QuoteSymbol** *(string) --*

                A custom symbol to denote what combines content into a
                single column value. It must be different from the
                column delimiter.

              * **ContainsHeader** *(string) --*

                Indicates whether the CSV file contains a header.

              * **Header** *(list) --*

                A list of strings representing column names.

                * *(string) --*

              * **DisableValueTrimming** *(boolean) --*

                Specifies not to trim values before identifying the
                type of column values. The default value is "true".

              * **AllowSingleColumn** *(boolean) --*

                Enables the processing of files that contain only one
                column.

              * **CustomDatatypeConfigured** *(boolean) --*

                Enables the custom datatype to be configured.

              * **CustomDatatypes** *(list) --*

                A list of custom datatypes including "BINARY",
                "BOOLEAN", "DATE", "DECIMAL", "DOUBLE", "FLOAT",
                "INT", "LONG", "SHORT", "STRING", "TIMESTAMP".

                * *(string) --*

              * **Serde** *(string) --*

                Sets the SerDe for processing CSV in the classifier,
                which will be applied in the Data Catalog. Valid
                values are "OpenCSVSerDe", "LazySimpleSerDe", and
                "None". You can specify the "None" value when you want
                the crawler to do the detection.

        * **NextToken** *(string) --*

          A continuation token.

   **Exceptions**

   * "Glue.Client.exceptions.OperationTimeoutException"
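
   Each entry in "Classifiers" carries its details under one of the
   four keys shown in the response syntax. The sketch below walks all
   pages and reports which kind each classifier is; it assumes that
   only the key for the classifier's own kind is populated, and the
   helper names are hypothetical.

```python
CLASSIFIER_KEYS = ("GrokClassifier", "XMLClassifier",
                   "JsonClassifier", "CsvClassifier")


def iter_classifiers(client, page_size=None):
    """Yield every classifier dict, following NextToken.

    `client` is assumed to be a Glue client, e.g. boto3.client('glue').
    """
    kwargs = {}
    if page_size is not None:
        kwargs["MaxResults"] = page_size
    while True:
        page = client.get_classifiers(**kwargs)
        yield from page.get("Classifiers", [])
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token


def classifier_kind_and_name(classifier):
    """Return (kind, name) for one entry, or (None, None) if unrecognised."""
    for key in CLASSIFIER_KEYS:
        details = classifier.get(key)
        if details:
            return key, details["Name"]
    return None, None
```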


list_table_optimizer_runs
*************************

Glue.Client.list_table_optimizer_runs(**kwargs)

   Lists the history of previous optimizer runs for a specific table.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.list_table_optimizer_runs(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          Type='compaction'|'retention'|'orphan_file_deletion',
          MaxResults=123,
          NextToken='string'
      )

   Parameters:
      * **CatalogId** (*string*) --

        **[REQUIRED]**

        The Catalog ID of the table.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the database in the catalog in which the table
        resides.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table.

      * **Type** (*string*) --

        **[REQUIRED]**

        The type of table optimizer.

      * **MaxResults** (*integer*) -- The maximum number of optimizer
        runs to return on each call.

      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation call.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'CatalogId': 'string',
             'DatabaseName': 'string',
             'TableName': 'string',
             'NextToken': 'string',
             'TableOptimizerRuns': [
                 {
                     'eventType': 'starting'|'completed'|'failed'|'in_progress',
                     'startTimestamp': datetime(2015, 1, 1),
                     'endTimestamp': datetime(2015, 1, 1),
                     'metrics': {
                         'NumberOfBytesCompacted': 'string',
                         'NumberOfFilesCompacted': 'string',
                         'NumberOfDpus': 'string',
                         'JobDurationInHour': 'string'
                     },
                     'error': 'string',
                     'compactionMetrics': {
                         'IcebergMetrics': {
                             'NumberOfBytesCompacted': 123,
                             'NumberOfFilesCompacted': 123,
                             'DpuHours': 123.0,
                             'NumberOfDpus': 123,
                             'JobDurationInHour': 123.0
                         }
                     },
                     'compactionStrategy': 'binpack'|'sort'|'z-order',
                     'retentionMetrics': {
                         'IcebergMetrics': {
                             'NumberOfDataFilesDeleted': 123,
                             'NumberOfManifestFilesDeleted': 123,
                             'NumberOfManifestListsDeleted': 123,
                             'DpuHours': 123.0,
                             'NumberOfDpus': 123,
                             'JobDurationInHour': 123.0
                         }
                     },
                     'orphanFileDeletionMetrics': {
                         'IcebergMetrics': {
                             'NumberOfOrphanFilesDeleted': 123,
                             'DpuHours': 123.0,
                             'NumberOfDpus': 123,
                             'JobDurationInHour': 123.0
                         }
                     }
                 },
             ]
         }

      **Response Structure**

      * *(dict) --*

        * **CatalogId** *(string) --*

          The Catalog ID of the table.

        * **DatabaseName** *(string) --*

          The name of the database in the catalog in which the table
          resides.

        * **TableName** *(string) --*

          The name of the table.

        * **NextToken** *(string) --*

          A continuation token for paginating the returned list of
          optimizer runs, returned if the current segment of the list
          is not the last.

        * **TableOptimizerRuns** *(list) --*

          A list of the optimizer runs associated with a table.

          * *(dict) --*

            Contains details for a table optimizer run.

            * **eventType** *(string) --*

              An event type representing the status of the table
              optimizer run.

            * **startTimestamp** *(datetime) --*

              Represents the epoch timestamp at which the compaction
              job was started within Lake Formation.

            * **endTimestamp** *(datetime) --*

              Represents the epoch timestamp at which the compaction
              job ended.

            * **metrics** *(dict) --*

              A "RunMetrics" object containing metrics for the
              optimizer run.

              This member is deprecated. See the individual metric
              members for compaction, retention, and orphan file
              deletion.

              * **NumberOfBytesCompacted** *(string) --*

                The number of bytes removed by the compaction job run.

              * **NumberOfFilesCompacted** *(string) --*

                The number of files removed by the compaction job run.

              * **NumberOfDpus** *(string) --*

                The number of DPUs consumed by the job, rounded up to
                the nearest whole number.

              * **JobDurationInHour** *(string) --*

                The duration of the job in hours.

            * **error** *(string) --*

              An error that occurred during the optimizer run.

            * **compactionMetrics** *(dict) --*

              A "CompactionMetrics" object containing metrics for the
              optimizer run.

              * **IcebergMetrics** *(dict) --*

                A structure containing the Iceberg compaction metrics
                for the optimizer run.

                * **NumberOfBytesCompacted** *(integer) --*

                  The number of bytes removed by the compaction job
                  run.

                * **NumberOfFilesCompacted** *(integer) --*

                  The number of files removed by the compaction job
                  run.

                * **DpuHours** *(float) --*

                  The number of DPU hours consumed by the job.

                * **NumberOfDpus** *(integer) --*

                  The number of DPUs consumed by the job, rounded up
                  to the nearest whole number.

                * **JobDurationInHour** *(float) --*

                  The duration of the job in hours.

            * **compactionStrategy** *(string) --*

              The strategy used for the compaction run. Indicates
              which algorithm was applied to determine how files were
              selected and combined during the compaction process.
              Valid values are:

              * "binpack": Combines small files into larger files,
                typically targeting sizes over 100MB, while applying
                any pending deletes. This is the recommended
                compaction strategy for most use cases.

              * "sort": Organizes data based on specified columns
                which are sorted hierarchically during compaction,
                improving query performance for filtered operations.
                This strategy is recommended when your queries
                frequently filter on specific columns. To use this
                strategy, you must first define a sort order in your
                Iceberg table properties using the "sort_order" table
                property.

              * "z-order": Optimizes data organization by blending
                multiple attributes into a single scalar value that
                can be used for sorting, allowing efficient querying
                across multiple dimensions. This strategy is
                recommended when you need to query data across
                multiple dimensions simultaneously. To use this
                strategy, you must first define a sort order in your
                Iceberg table properties using the "sort_order" table
                property.

            * **retentionMetrics** *(dict) --*

              A "RetentionMetrics" object containing metrics for the
              optimizer run.

              * **IcebergMetrics** *(dict) --*

                A structure containing the Iceberg retention metrics
                for the optimizer run.

                * **NumberOfDataFilesDeleted** *(integer) --*

                  The number of data files deleted by the retention
                  job run.

                * **NumberOfManifestFilesDeleted** *(integer) --*

                  The number of manifest files deleted by the
                  retention job run.

                * **NumberOfManifestListsDeleted** *(integer) --*

                  The number of manifest lists deleted by the
                  retention job run.

                * **DpuHours** *(float) --*

                  The number of DPU hours consumed by the job.

                * **NumberOfDpus** *(integer) --*

                  The number of DPUs consumed by the job, rounded up
                  to the nearest whole number.

                * **JobDurationInHour** *(float) --*

                  The duration of the job in hours.

            * **orphanFileDeletionMetrics** *(dict) --*

              An "OrphanFileDeletionMetrics" object containing metrics
              for the optimizer run.

              * **IcebergMetrics** *(dict) --*

                A structure containing the Iceberg orphan file
                deletion metrics for the optimizer run.

                * **NumberOfOrphanFilesDeleted** *(integer) --*

                  The number of orphan files deleted by the orphan
                  file deletion job run.

                * **DpuHours** *(float) --*

                  The number of DPU hours consumed by the job.

                * **NumberOfDpus** *(integer) --*

                  The number of DPUs consumed by the job, rounded up
                  to the nearest whole number.

                * **JobDurationInHour** *(float) --*

                  The duration of the job in hours.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.ValidationException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.ThrottlingException"
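
   Since "metrics" is deprecated, per-type metrics are read from
   "compactionMetrics", "retentionMetrics" or
   "orphanFileDeletionMetrics". The sketch below pages through the run
   history and sums "DpuHours" over completed runs; the helper names
   and the "METRICS_KEY" mapping are hypothetical conveniences derived
   from the response syntax above.

```python
# Maps the request's Type value to the response member holding its metrics.
METRICS_KEY = {
    "compaction": "compactionMetrics",
    "retention": "retentionMetrics",
    "orphan_file_deletion": "orphanFileDeletionMetrics",
}


def iter_optimizer_runs(client, catalog_id, database, table, optimizer_type):
    """Yield every optimizer run for a table, following NextToken.

    `client` is assumed to be a Glue client, e.g. boto3.client('glue').
    """
    kwargs = {"CatalogId": catalog_id, "DatabaseName": database,
              "TableName": table, "Type": optimizer_type}
    while True:
        page = client.list_table_optimizer_runs(**kwargs)
        yield from page.get("TableOptimizerRuns", [])
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token


def total_dpu_hours(runs, optimizer_type):
    """Sum Iceberg DpuHours over runs whose eventType is 'completed'."""
    key = METRICS_KEY[optimizer_type]
    total = 0.0
    for run in runs:
        if run.get("eventType") != "completed":
            continue
        iceberg = run.get(key, {}).get("IcebergMetrics", {})
        total += iceberg.get("DpuHours", 0.0)
    return total
```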


delete_column_statistics_for_table
**********************************

Glue.Client.delete_column_statistics_for_table(**kwargs)

   Deletes table statistics of columns.

   The Identity and Access Management (IAM) permission required for
   this operation is "DeleteTable".

   See also: AWS API Documentation

   **Request Syntax**

      response = client.delete_column_statistics_for_table(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          ColumnName='string'
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog where
        the table in question resides. If none is supplied, the
        Amazon Web Services account ID is used by default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the catalog database where the table resides.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table.

      * **ColumnName** (*string*) --

        **[REQUIRED]**

        The name of the column.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"
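
   The call takes a single "ColumnName", so clearing statistics for
   several columns means one request per column. The sketch below
   loops over a list of columns and collects those for which the
   service reported "EntityNotFoundException". Whether that exception
   is raised for a column without statistics is an assumption, and the
   helper name "delete_column_statistics" is hypothetical.

```python
def delete_column_statistics(client, database, table, columns, catalog_id=None):
    """Delete table-level column statistics for each listed column.

    Returns the column names for which the service raised
    EntityNotFoundException. `client` is assumed to be a Glue client,
    e.g. boto3.client('glue').
    """
    not_found = []
    for column in columns:
        kwargs = {"DatabaseName": database, "TableName": table,
                  "ColumnName": column}
        if catalog_id is not None:
            kwargs["CatalogId"] = catalog_id
        try:
            client.delete_column_statistics_for_table(**kwargs)
        except client.exceptions.EntityNotFoundException:
            not_found.append(column)
    return not_found
```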


get_crawler_metrics
*******************

Glue.Client.get_crawler_metrics(**kwargs)

   Retrieves metrics about specified crawlers.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_crawler_metrics(
          CrawlerNameList=[
              'string',
          ],
          MaxResults=123,
          NextToken='string'
      )

   Parameters:
      * **CrawlerNameList** (*list*) --

        A list of the names of crawlers about which to retrieve
        metrics.

        * *(string) --*

      * **MaxResults** (*integer*) -- The maximum size of a list to
        return.

      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation call.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'CrawlerMetricsList': [
                 {
                     'CrawlerName': 'string',
                     'TimeLeftSeconds': 123.0,
                     'StillEstimating': True|False,
                     'LastRuntimeSeconds': 123.0,
                     'MedianRuntimeSeconds': 123.0,
                     'TablesCreated': 123,
                     'TablesUpdated': 123,
                     'TablesDeleted': 123
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **CrawlerMetricsList** *(list) --*

          A list of metrics for the specified crawler.

          * *(dict) --*

            Metrics for a specified crawler.

            * **CrawlerName** *(string) --*

              The name of the crawler.

            * **TimeLeftSeconds** *(float) --*

              The estimated time left to complete a running crawl.

            * **StillEstimating** *(boolean) --*

              True if the crawler is still estimating how long it will
              take to complete this run.

            * **LastRuntimeSeconds** *(float) --*

              The duration of the crawler's most recent run, in
              seconds.

            * **MedianRuntimeSeconds** *(float) --*

              The median duration of this crawler's runs, in seconds.

            * **TablesCreated** *(integer) --*

              The number of tables created by this crawler.

            * **TablesUpdated** *(integer) --*

              The number of tables updated by this crawler.

            * **TablesDeleted** *(integer) --*

              The number of tables deleted by this crawler.

        * **NextToken** *(string) --*

          A continuation token, if the returned list does not contain
          the last metric available.

   **Exceptions**

   * "Glue.Client.exceptions.OperationTimeoutException"
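
   The "NextToken" round trip described above can be sketched as a
   small generator. This is an illustrative helper, not part of
   boto3: the name "iter_crawler_metrics" and the "page_size"
   argument are assumptions, and "client" is expected to be a Glue
   client such as the one returned by "boto3.client('glue')".

```python
def iter_crawler_metrics(client, crawler_names=None, page_size=50):
    """Yield each entry of CrawlerMetricsList, following NextToken.

    Hypothetical helper: `client` is any object exposing
    get_crawler_metrics(**kwargs) with the request/response shape
    documented above.
    """
    kwargs = {"MaxResults": page_size}
    if crawler_names:
        kwargs["CrawlerNameList"] = list(crawler_names)

    while True:
        response = client.get_crawler_metrics(**kwargs)
        yield from response.get("CrawlerMetricsList", [])

        # A missing or empty NextToken means the last page was returned.
        token = response.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token
```

   With a real client, iterating over
   "iter_crawler_metrics(client, ['my-crawler'])" would yield one
   dict per crawler, each carrying the keys shown in the response
   syntax ("my-crawler" is a placeholder name).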


create_classifier
*****************

Glue.Client.create_classifier(**kwargs)

   Creates a classifier in the user's account. This can be a
   "GrokClassifier", an "XMLClassifier", a "JsonClassifier", or a
   "CsvClassifier", depending on which field of the request is
   present.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.create_classifier(
          GrokClassifier={
              'Classification': 'string',
              'Name': 'string',
              'GrokPattern': 'string',
              'CustomPatterns': 'string'
          },
          XMLClassifier={
              'Classification': 'string',
              'Name': 'string',
              'RowTag': 'string'
          },
          JsonClassifier={
              'Name': 'string',
              'JsonPath': 'string'
          },
          CsvClassifier={
              'Name': 'string',
              'Delimiter': 'string',
              'QuoteSymbol': 'string',
              'ContainsHeader': 'UNKNOWN'|'PRESENT'|'ABSENT',
              'Header': [
                  'string',
              ],
              'DisableValueTrimming': True|False,
              'AllowSingleColumn': True|False,
              'CustomDatatypeConfigured': True|False,
              'CustomDatatypes': [
                  'string',
              ],
              'Serde': 'OpenCSVSerDe'|'LazySimpleSerDe'|'None'
          }
      )

   Parameters:
      * **GrokClassifier** (*dict*) --

        A "GrokClassifier" object specifying the classifier to create.

        * **Classification** *(string) --* **[REQUIRED]**

          An identifier of the data format that the classifier
          matches, such as Twitter, JSON, Omniture logs, Amazon
          CloudWatch Logs, and so on.

        * **Name** *(string) --* **[REQUIRED]**

          The name of the new classifier.

        * **GrokPattern** *(string) --* **[REQUIRED]**

          The grok pattern used by this classifier.

        * **CustomPatterns** *(string) --*

          Optional custom grok patterns used by this classifier.

      * **XMLClassifier** (*dict*) --

        An "XMLClassifier" object specifying the classifier to create.

        * **Classification** *(string) --* **[REQUIRED]**

          An identifier of the data format that the classifier
          matches.

        * **Name** *(string) --* **[REQUIRED]**

          The name of the classifier.

        * **RowTag** *(string) --*

          The XML tag designating the element that contains each
          record in an XML document being parsed. This can't identify
          a self-closing element (closed by "/>"). An empty row
          element that contains only attributes can be parsed as long
          as it ends with a closing tag (for example, "<row item_a="A"
          item_b="B"></row>" is okay, but "<row item_a="A" item_b="B"
          />" is not).

      * **JsonClassifier** (*dict*) --

        A "JsonClassifier" object specifying the classifier to create.

        * **Name** *(string) --* **[REQUIRED]**

          The name of the classifier.

        * **JsonPath** *(string) --* **[REQUIRED]**

          A "JsonPath" string defining the JSON data for the
          classifier to classify. Glue supports a subset of JsonPath,
          as described in Writing JsonPath Custom Classifiers.

      * **CsvClassifier** (*dict*) --

        A "CsvClassifier" object specifying the classifier to create.

        * **Name** *(string) --* **[REQUIRED]**

          The name of the classifier.

        * **Delimiter** *(string) --*

          A custom symbol to denote what separates each column entry
          in the row.

        * **QuoteSymbol** *(string) --*

          A custom symbol to denote what combines content into a
          single column value. Must be different from the column
          delimiter.

        * **ContainsHeader** *(string) --*

          Indicates whether the CSV file contains a header.

        * **Header** *(list) --*

          A list of strings representing column names.

          * *(string) --*

        * **DisableValueTrimming** *(boolean) --*

          Specifies not to trim values before identifying the type of
          column values. The default value is true.

        * **AllowSingleColumn** *(boolean) --*

          Enables the processing of files that contain only one
          column.

        * **CustomDatatypeConfigured** *(boolean) --*

          Enables the configuration of custom datatypes.

        * **CustomDatatypes** *(list) --*

          Creates a list of supported custom datatypes.

          * *(string) --*

        * **Serde** *(string) --*

          Sets the SerDe for processing CSV in the classifier, which
          will be applied in the Data Catalog. Valid values are
          "OpenCSVSerDe", "LazySimpleSerDe", and "None". You can
          specify the "None" value when you want the crawler to do the
          detection.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.AlreadyExistsException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"
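
   Because the classifier type is chosen by which field of the
   request is present, a request for a CSV classifier carries only
   the "CsvClassifier" key. The sketch below builds such a request;
   the helper name and its defaults are assumptions made for
   illustration, and the delimiter check simply mirrors the
   "QuoteSymbol" rule stated above.

```python
def build_csv_classifier_request(name, header, delimiter=",", quote_symbol='"'):
    """Build keyword arguments for create_classifier (CSV variant only).

    Hypothetical helper; the default delimiter and quote symbol are
    illustrative choices, not service defaults.
    """
    if delimiter == quote_symbol:
        # The documentation requires QuoteSymbol to differ from the delimiter.
        raise ValueError("QuoteSymbol must be different from the column delimiter")

    return {
        "CsvClassifier": {
            "Name": name,
            "Delimiter": delimiter,
            "QuoteSymbol": quote_symbol,
            # A header list is supplied, so the file is declared to have one.
            "ContainsHeader": "PRESENT",
            "Header": list(header),
        }
    }
```

   The result can be passed straight to the client, for example
   "client.create_classifier(**build_csv_classifier_request('orders-csv',
   ['id', 'total']))", where the classifier name and column names are
   placeholders.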


update_connection
*****************

Glue.Client.update_connection(**kwargs)

   Updates a connection definition in the Data Catalog.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.update_connection(
          CatalogId='string',
          Name='string',
          ConnectionInput={
              'Name': 'string',
              'Description': 'string',
              'ConnectionType': 'JDBC'|'SFTP'|'MONGODB'|'KAFKA'|'NETWORK'|'MARKETPLACE'|'CUSTOM'|'SALESFORCE'|'VIEW_VALIDATION_REDSHIFT'|'VIEW_VALIDATION_ATHENA'|'GOOGLEADS'|'GOOGLESHEETS'|'GOOGLEANALYTICS4'|'SERVICENOW'|'MARKETO'|'SAPODATA'|'ZENDESK'|'JIRACLOUD'|'NETSUITEERP'|'HUBSPOT'|'FACEBOOKADS'|'INSTAGRAMADS'|'ZOHOCRM'|'SALESFORCEPARDOT'|'SALESFORCEMARKETINGCLOUD'|'SLACK'|'STRIPE'|'INTERCOM'|'SNAPCHATADS',
              'MatchCriteria': [
                  'string',
              ],
              'ConnectionProperties': {
                  'string': 'string'
              },
              'SparkProperties': {
                  'string': 'string'
              },
              'AthenaProperties': {
                  'string': 'string'
              },
              'PythonProperties': {
                  'string': 'string'
              },
              'PhysicalConnectionRequirements': {
                  'SubnetId': 'string',
                  'SecurityGroupIdList': [
                      'string',
                  ],
                  'AvailabilityZone': 'string'
              },
              'AuthenticationConfiguration': {
                  'AuthenticationType': 'BASIC'|'OAUTH2'|'CUSTOM'|'IAM',
                  'OAuth2Properties': {
                      'OAuth2GrantType': 'AUTHORIZATION_CODE'|'CLIENT_CREDENTIALS'|'JWT_BEARER',
                      'OAuth2ClientApplication': {
                          'UserManagedClientApplicationClientId': 'string',
                          'AWSManagedClientApplicationReference': 'string'
                      },
                      'TokenUrl': 'string',
                      'TokenUrlParametersMap': {
                          'string': 'string'
                      },
                      'AuthorizationCodeProperties': {
                          'AuthorizationCode': 'string',
                          'RedirectUri': 'string'
                      },
                      'OAuth2Credentials': {
                          'UserManagedClientApplicationClientSecret': 'string',
                          'AccessToken': 'string',
                          'RefreshToken': 'string',
                          'JwtToken': 'string'
                      }
                  },
                  'SecretArn': 'string',
                  'KmsKeyArn': 'string',
                  'BasicAuthenticationCredentials': {
                      'Username': 'string',
                      'Password': 'string'
                  },
                  'CustomAuthenticationCredentials': {
                      'string': 'string'
                  }
              },
              'ValidateCredentials': True|False,
              'ValidateForComputeEnvironments': [
                  'SPARK'|'ATHENA'|'PYTHON',
              ]
          }
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog in
        which the connection resides. If none is provided, the Amazon
        Web Services account ID is used by default.

      * **Name** (*string*) --

        **[REQUIRED]**

        The name of the connection definition to update.

      * **ConnectionInput** (*dict*) --

        **[REQUIRED]**

        A "ConnectionInput" object that redefines the connection in
        question.

        * **Name** *(string) --* **[REQUIRED]**

          The name of the connection.

        * **Description** *(string) --*

          The description of the connection.

        * **ConnectionType** *(string) --* **[REQUIRED]**

          The type of the connection. Currently, these types are
          supported:

          * "JDBC" - Designates a connection to a database through
            Java Database Connectivity (JDBC). "JDBC" Connections use
            the following ConnectionParameters.

            * Required: All of ( "HOST", "PORT", "JDBC_ENGINE") or
              "JDBC_CONNECTION_URL".

            * Required: All of ( "USERNAME", "PASSWORD") or
              "SECRET_ID".

            * Optional: "JDBC_ENFORCE_SSL", "CUSTOM_JDBC_CERT",
              "CUSTOM_JDBC_CERT_STRING",
              "SKIP_CUSTOM_JDBC_CERT_VALIDATION". These parameters are
              used to configure SSL with JDBC.

          * "KAFKA" - Designates a connection to an Apache Kafka
            streaming platform. "KAFKA" Connections use the following
            ConnectionParameters.

            * Required: "KAFKA_BOOTSTRAP_SERVERS".

            * Optional: "KAFKA_SSL_ENABLED", "KAFKA_CUSTOM_CERT",
              "KAFKA_SKIP_CUSTOM_CERT_VALIDATION". These parameters
              are used to configure SSL with "KAFKA".

            * Optional: "KAFKA_CLIENT_KEYSTORE",
              "KAFKA_CLIENT_KEYSTORE_PASSWORD",
              "KAFKA_CLIENT_KEY_PASSWORD",
              "ENCRYPTED_KAFKA_CLIENT_KEYSTORE_PASSWORD",
              "ENCRYPTED_KAFKA_CLIENT_KEY_PASSWORD". These parameters
              are used to configure TLS client configuration with SSL
              in "KAFKA".

            * Optional: "KAFKA_SASL_MECHANISM". Can be specified as
              "SCRAM-SHA-512", "GSSAPI", or "AWS_MSK_IAM".

            * Optional: "KAFKA_SASL_SCRAM_USERNAME",
              "KAFKA_SASL_SCRAM_PASSWORD",
              "ENCRYPTED_KAFKA_SASL_SCRAM_PASSWORD". These parameters
              are used to configure SASL/SCRAM-SHA-512 authentication
              with "KAFKA".

            * Optional: "KAFKA_SASL_GSSAPI_KEYTAB",
              "KAFKA_SASL_GSSAPI_KRB5_CONF",
              "KAFKA_SASL_GSSAPI_SERVICE",
              "KAFKA_SASL_GSSAPI_PRINCIPAL". These parameters are used
              to configure SASL/GSSAPI authentication with "KAFKA".

          * "MONGODB" - Designates a connection to a MongoDB document
            database. "MONGODB" Connections use the following
            ConnectionParameters.

            * Required: "CONNECTION_URL".

            * Required: All of ( "USERNAME", "PASSWORD") or
              "SECRET_ID".

          * "VIEW_VALIDATION_REDSHIFT" - Designates a connection used
            for view validation by Amazon Redshift.

          * "VIEW_VALIDATION_ATHENA" - Designates a connection used
            for view validation by Amazon Athena.

          * "NETWORK" - Designates a network connection to a data
            source within an Amazon Virtual Private Cloud environment
            (Amazon VPC). "NETWORK" Connections do not require
            ConnectionParameters. Instead, provide a
            PhysicalConnectionRequirements.

          * "MARKETPLACE" - Uses configuration settings contained in a
            connector purchased from Amazon Web Services Marketplace
            to read from and write to data stores that are not
            natively supported by Glue. "MARKETPLACE" Connections use
            the following ConnectionParameters.

            * Required: "CONNECTOR_TYPE", "CONNECTOR_URL",
              "CONNECTOR_CLASS_NAME", "CONNECTION_URL".

            * Required for "JDBC" "CONNECTOR_TYPE" connections: All of
              ( "USERNAME", "PASSWORD") or "SECRET_ID".

          * "CUSTOM" - Uses configuration settings contained in a
            custom connector to read from and write to data stores
            that are not natively supported by Glue.

          Additionally, a "ConnectionType" for the following SaaS
          connectors is supported:

          * "FACEBOOKADS" - Designates a connection to Facebook Ads.

          * "GOOGLEADS" - Designates a connection to Google Ads.

          * "GOOGLESHEETS" - Designates a connection to Google Sheets.

          * "GOOGLEANALYTICS4" - Designates a connection to Google
            Analytics 4.

          * "HUBSPOT" - Designates a connection to HubSpot.

          * "INSTAGRAMADS" - Designates a connection to Instagram Ads.

          * "INTERCOM" - Designates a connection to Intercom.

          * "JIRACLOUD" - Designates a connection to Jira Cloud.

          * "MARKETO" - Designates a connection to Adobe Marketo
            Engage.

          * "NETSUITEERP" - Designates a connection to Oracle
            NetSuite.

          * "SALESFORCE" - Designates a connection to Salesforce using
            OAuth authentication.

          * "SALESFORCEMARKETINGCLOUD" - Designates a connection to
            Salesforce Marketing Cloud.

          * "SALESFORCEPARDOT" - Designates a connection to Salesforce
            Marketing Cloud Account Engagement (MCAE).

          * "SAPODATA" - Designates a connection to SAP OData.

          * "SERVICENOW" - Designates a connection to ServiceNow.

          * "SLACK" - Designates a connection to Slack.

          * "SNAPCHATADS" - Designates a connection to Snapchat Ads.

          * "STRIPE" - Designates a connection to Stripe.

          * "ZENDESK" - Designates a connection to Zendesk.

          * "ZOHOCRM" - Designates a connection to Zoho CRM.

          For more information on the connection parameters needed for
          a particular connector, see the documentation for the
          connector in Adding a Glue connection
          (https://docs.aws.amazon.com/glue/latest/dg/console-connections.html)
          in the Glue User Guide.

          "SFTP" is not supported.

          For more information about how optional ConnectionProperties
          are used to configure features in Glue, consult Glue
          connection properties.

          For more information about how optional ConnectionProperties
          are used to configure features in Glue Studio, consult Using
          connectors and connections.

        * **MatchCriteria** *(list) --*

          A list of criteria that can be used in selecting this
          connection.

          * *(string) --*

        * **ConnectionProperties** *(dict) --* **[REQUIRED]**

          These key-value pairs define parameters for the connection.

          * *(string) --*

            * *(string) --*

        * **SparkProperties** *(dict) --*

          Connection properties specific to the Spark compute
          environment.

          * *(string) --*

            * *(string) --*

        * **AthenaProperties** *(dict) --*

          Connection properties specific to the Athena compute
          environment.

          * *(string) --*

            * *(string) --*

        * **PythonProperties** *(dict) --*

          Connection properties specific to the Python compute
          environment.

          * *(string) --*

            * *(string) --*

        * **PhysicalConnectionRequirements** *(dict) --*

          The physical connection requirements, such as virtual
          private cloud (VPC) and "SecurityGroup", that are needed to
          successfully make this connection.

          * **SubnetId** *(string) --*

            The subnet ID used by the connection.

          * **SecurityGroupIdList** *(list) --*

            The security group ID list used by the connection.

            * *(string) --*

          * **AvailabilityZone** *(string) --*

            The connection's Availability Zone.

        * **AuthenticationConfiguration** *(dict) --*

          The authentication properties of the connection.

          * **AuthenticationType** *(string) --*

            A structure containing the authentication configuration in
            the CreateConnection request.

          * **OAuth2Properties** *(dict) --*

            The properties for OAuth2 authentication in the
            CreateConnection request.

            * **OAuth2GrantType** *(string) --*

              The OAuth2 grant type in the CreateConnection request.
              For example, "AUTHORIZATION_CODE", "JWT_BEARER", or
              "CLIENT_CREDENTIALS".

            * **OAuth2ClientApplication** *(dict) --*

              The client application type in the CreateConnection
              request. For example, "AWS_MANAGED" or "USER_MANAGED".

              * **UserManagedClientApplicationClientId** *(string) --*

                The client application clientID if the ClientAppType
                is "USER_MANAGED".

              * **AWSManagedClientApplicationReference** *(string) --*

                The reference to the SaaS-side client app that is
                Amazon Web Services managed.

            * **TokenUrl** *(string) --*

              The URL of the provider's authentication server, to
              exchange an authorization code for an access token.

            * **TokenUrlParametersMap** *(dict) --*

              A map of parameters that are added to the token "GET"
              request.

              * *(string) --*

                * *(string) --*

            * **AuthorizationCodeProperties** *(dict) --*

              The set of properties required for the OAuth2
              "AUTHORIZATION_CODE" grant type.

              * **AuthorizationCode** *(string) --*

                An authorization code to be used in the third leg of
                the "AUTHORIZATION_CODE" grant workflow. This is a
                single-use code which becomes invalid once exchanged
                for an access token, thus it is acceptable to have
                this value as a request parameter.

              * **RedirectUri** *(string) --*

                The redirect URI to which the authorization server
                redirects the user when issuing an authorization
                code. The URI is subsequently used when the
                authorization code is exchanged for an access token.

            * **OAuth2Credentials** *(dict) --*

              The credentials used when the authentication type is
              OAuth2 authentication.

              * **UserManagedClientApplicationClientSecret** *(string)
                --*

                The client application client secret if the client
                application is user managed.

              * **AccessToken** *(string) --*

                The access token used when the authentication type is
                OAuth2.

              * **RefreshToken** *(string) --*

                The refresh token used when the authentication type is
                OAuth2.

              * **JwtToken** *(string) --*

                The JSON Web Token (JWT) used when the authentication
                type is OAuth2.

          * **SecretArn** *(string) --*

            The secret manager ARN to store credentials in the
            CreateConnection request.

          * **KmsKeyArn** *(string) --*

            The ARN of the KMS key used to encrypt the connection.
            Only taken as an input in the request and stored in the
            Secret Manager.

          * **BasicAuthenticationCredentials** *(dict) --*

            The credentials used when the authentication type is basic
            authentication.

            * **Username** *(string) --*

              The username to connect to the data source.

            * **Password** *(string) --*

              The password to connect to the data source.

          * **CustomAuthenticationCredentials** *(dict) --*

            The credentials used when the authentication type is
            custom authentication.

            * *(string) --*

              * *(string) --*

        * **ValidateCredentials** *(boolean) --*

          A flag to validate the credentials during create connection.
          Default is true.

        * **ValidateForComputeEnvironments** *(list) --*

          The compute environments that the specified connection
          properties are validated against.

          * *(string) --*

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"
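
   The "JDBC" requirements listed under "ConnectionType" can be
   checked on the client side before the request is sent. The sketch
   below is illustrative only: both helper names are hypothetical,
   and it assumes the parameters named there ("HOST",
   "JDBC_CONNECTION_URL", "SECRET_ID", and so on) are supplied as
   keys of "ConnectionProperties".

```python
def check_jdbc_properties(props):
    """Return a list of problems with JDBC connection properties.

    Hypothetical pre-flight check; an empty list means both
    'Required' rules documented for JDBC are satisfied.
    """
    problems = []

    has_url = "JDBC_CONNECTION_URL" in props
    has_host_triplet = all(k in props for k in ("HOST", "PORT", "JDBC_ENGINE"))
    if not (has_url or has_host_triplet):
        problems.append("need JDBC_CONNECTION_URL or all of HOST, PORT, JDBC_ENGINE")

    has_secret = "SECRET_ID" in props
    has_user_pass = all(k in props for k in ("USERNAME", "PASSWORD"))
    if not (has_secret or has_user_pass):
        problems.append("need SECRET_ID or both USERNAME and PASSWORD")

    return problems


def build_jdbc_update_request(name, props, description=None):
    """Build keyword arguments for update_connection for a JDBC connection."""
    problems = check_jdbc_properties(props)
    if problems:
        raise ValueError("; ".join(problems))

    connection_input = {
        "Name": name,
        "ConnectionType": "JDBC",
        "ConnectionProperties": dict(props),
    }
    if description is not None:
        connection_input["Description"] = description

    # The top-level Name selects the connection definition to update.
    return {"Name": name, "ConnectionInput": connection_input}
```

   A call would then look like
   "client.update_connection(**build_jdbc_update_request(name, props))".
   The service performs its own validation, so this check is only a
   convenience for failing early.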


remove_schema_version_metadata
******************************

Glue.Client.remove_schema_version_metadata(**kwargs)

   Removes a key value pair from the schema version metadata for the
   specified schema version ID.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.remove_schema_version_metadata(
          SchemaId={
              'SchemaArn': 'string',
              'SchemaName': 'string',
              'RegistryName': 'string'
          },
          SchemaVersionNumber={
              'LatestVersion': True|False,
              'VersionNumber': 123
          },
          SchemaVersionId='string',
          MetadataKeyValue={
              'MetadataKey': 'string',
              'MetadataValue': 'string'
          }
      )

   Parameters:
      * **SchemaId** (*dict*) --

        A wrapper structure that may contain the schema name and
        Amazon Resource Name (ARN).

        * **SchemaArn** *(string) --*

          The Amazon Resource Name (ARN) of the schema. One of
          "SchemaArn" or "SchemaName" has to be provided.

        * **SchemaName** *(string) --*

          The name of the schema. One of "SchemaArn" or "SchemaName"
          has to be provided.

        * **RegistryName** *(string) --*

          The name of the schema registry that contains the schema.

      * **SchemaVersionNumber** (*dict*) --

        The version number of the schema.

        * **LatestVersion** *(boolean) --*

          The latest version available for the schema.

        * **VersionNumber** *(integer) --*

          The version number of the schema.

      * **SchemaVersionId** (*string*) -- The unique version ID of the
        schema version.

      * **MetadataKeyValue** (*dict*) --

        **[REQUIRED]**

        The value of the metadata key.

        * **MetadataKey** *(string) --*

          A metadata key.

        * **MetadataValue** *(string) --*

          A metadata key’s corresponding value.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'SchemaArn': 'string',
             'SchemaName': 'string',
             'RegistryName': 'string',
             'LatestVersion': True|False,
             'VersionNumber': 123,
             'SchemaVersionId': 'string',
             'MetadataKey': 'string',
             'MetadataValue': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **SchemaArn** *(string) --*

          The Amazon Resource Name (ARN) of the schema.

        * **SchemaName** *(string) --*

          The name of the schema.

        * **RegistryName** *(string) --*

          The name of the registry.

        * **LatestVersion** *(boolean) --*

          The latest version of the schema.

        * **VersionNumber** *(integer) --*

          The version number of the schema.

        * **SchemaVersionId** *(string) --*

          The version ID for the schema version.

        * **MetadataKey** *(string) --*

          The metadata key.

        * **MetadataValue** *(string) --*

          The value of the metadata key.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.EntityNotFoundException"
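
   A request for this method might be assembled as follows. The
   helper is hypothetical, and it rests on an assumption the text
   above does not spell out: that a schema version is identified
   either by "SchemaVersionId" alone or by "SchemaId" together with
   "SchemaVersionNumber".

```python
def build_remove_metadata_request(key, value, schema_version_id=None,
                                  schema_name=None, registry_name=None,
                                  version_number=None):
    """Build keyword arguments for remove_schema_version_metadata.

    Hypothetical helper. Assumes the version is addressed either by
    SchemaVersionId, or by SchemaId plus SchemaVersionNumber.
    """
    request = {"MetadataKeyValue": {"MetadataKey": key, "MetadataValue": value}}

    if schema_version_id is not None:
        request["SchemaVersionId"] = schema_version_id
        return request

    if schema_name is None:
        raise ValueError("provide schema_version_id or schema_name")

    schema_id = {"SchemaName": schema_name}
    if registry_name is not None:
        schema_id["RegistryName"] = registry_name
    request["SchemaId"] = schema_id

    # Without an explicit number, address the latest version.
    if version_number is None:
        request["SchemaVersionNumber"] = {"LatestVersion": True}
    else:
        request["SchemaVersionNumber"] = {"VersionNumber": version_number}
    return request
```

   The resulting dict is meant to be unpacked into the call, as in
   "client.remove_schema_version_metadata(**request)".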


get_registry
************

Glue.Client.get_registry(**kwargs)

   Describes the specified registry in detail.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_registry(
          RegistryId={
              'RegistryName': 'string',
              'RegistryArn': 'string'
          }
      )

   Parameters:
      **RegistryId** (*dict*) --

      **[REQUIRED]**

      This is a wrapper structure that may contain the registry name
      and Amazon Resource Name (ARN).

      * **RegistryName** *(string) --*

        Name of the registry. Used only for lookup. One of
        "RegistryArn" or "RegistryName" has to be provided.

      * **RegistryArn** *(string) --*

        ARN of the registry. One of "RegistryArn" or
        "RegistryName" has to be provided.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'RegistryName': 'string',
             'RegistryArn': 'string',
             'Description': 'string',
             'Status': 'AVAILABLE'|'DELETING',
             'CreatedTime': 'string',
             'UpdatedTime': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **RegistryName** *(string) --*

          The name of the registry.

        * **RegistryArn** *(string) --*

          The Amazon Resource Name (ARN) of the registry.

        * **Description** *(string) --*

          A description of the registry.

        * **Status** *(string) --*

          The status of the registry.

        * **CreatedTime** *(string) --*

          The date and time the registry was created.

        * **UpdatedTime** *(string) --*

          The date and time the registry was updated.

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"
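
   As a rough illustration of the "RegistryId" wrapper, the sketch
   below looks a registry up by name or by ARN. Both function names
   are hypothetical, and requiring exactly one identifier is a
   simplification chosen here; the text above only says that one of
   the two has to be provided.

```python
def build_registry_id(name=None, arn=None):
    """Build the RegistryId wrapper from a name or an ARN.

    Hypothetical helper; insisting on exactly one identifier is a
    simplification, not a documented service rule.
    """
    if (name is None) == (arn is None):
        raise ValueError("provide exactly one of name or arn")
    if name is not None:
        return {"RegistryName": name}
    return {"RegistryArn": arn}


def describe_registry(client, name=None, arn=None):
    """Call get_registry and return a few fields of the response."""
    response = client.get_registry(RegistryId=build_registry_id(name, arn))
    return {k: response.get(k) for k in ("RegistryName", "RegistryArn", "Status")}
```

   Here "client" stands for a Glue client such as
   "boto3.client('glue')"; a registry that is being removed would
   report a "Status" of "DELETING".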


get_table_version
*****************

Glue.Client.get_table_version(**kwargs)

   Retrieves a specified version of a table.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_table_version(
          CatalogId='string',
          DatabaseName='string',
          TableName='string',
          VersionId='string'
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog where
        the tables reside. If none is provided, the Amazon Web
        Services account ID is used by default.

      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The database in the catalog in which the table resides. For
        Hive compatibility, this name is entirely lowercase.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table. For Hive compatibility, this name is
        entirely lowercase.

      * **VersionId** (*string*) -- The ID value of the table version
        to be retrieved. A "VersionId" is a string representation of
        an integer. Each version is incremented by 1.
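
   Since "VersionId" is a string that represents an integer, callers
   that track versions numerically need to convert before calling.
   The sketch below shows one way to do that; the helper name is
   hypothetical, and lowercasing the database and table names on the
   client side is an assumption based on the Hive-compatibility
   notes above.

```python
def build_table_version_request(database, table, version, catalog_id=None):
    """Build keyword arguments for get_table_version.

    Hypothetical helper. `version` may be an int or a numeric string;
    it is normalized to the string form the API expects.
    """
    request = {
        # Assumption: lowercase names locally, matching the Hive note.
        "DatabaseName": database.lower(),
        "TableName": table.lower(),
        "VersionId": str(int(version)),
    }
    if catalog_id is not None:
        request["CatalogId"] = catalog_id
    return request
```

   The request would then be sent with
   "client.get_table_version(**request)".
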

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'TableVersion': {
                 'Table': {
                     'Name': 'string',
                     'DatabaseName': 'string',
                     'Description': 'string',
                     'Owner': 'string',
                     'CreateTime': datetime(2015, 1, 1),
                     'UpdateTime': datetime(2015, 1, 1),
                     'LastAccessTime': datetime(2015, 1, 1),
                     'LastAnalyzedTime': datetime(2015, 1, 1),
                     'Retention': 123,
                     'StorageDescriptor': {
                         'Columns': [
                             {
                                 'Name': 'string',
                                 'Type': 'string',
                                 'Comment': 'string',
                                 'Parameters': {
                                     'string': 'string'
                                 }
                             },
                         ],
                         'Location': 'string',
                         'AdditionalLocations': [
                             'string',
                         ],
                         'InputFormat': 'string',
                         'OutputFormat': 'string',
                         'Compressed': True|False,
                         'NumberOfBuckets': 123,
                         'SerdeInfo': {
                             'Name': 'string',
                             'SerializationLibrary': 'string',
                             'Parameters': {
                                 'string': 'string'
                             }
                         },
                         'BucketColumns': [
                             'string',
                         ],
                         'SortColumns': [
                             {
                                 'Column': 'string',
                                 'SortOrder': 123
                             },
                         ],
                         'Parameters': {
                             'string': 'string'
                         },
                         'SkewedInfo': {
                             'SkewedColumnNames': [
                                 'string',
                             ],
                             'SkewedColumnValues': [
                                 'string',
                             ],
                             'SkewedColumnValueLocationMaps': {
                                 'string': 'string'
                             }
                         },
                         'StoredAsSubDirectories': True|False,
                         'SchemaReference': {
                             'SchemaId': {
                                 'SchemaArn': 'string',
                                 'SchemaName': 'string',
                                 'RegistryName': 'string'
                             },
                             'SchemaVersionId': 'string',
                             'SchemaVersionNumber': 123
                         }
                     },
                     'PartitionKeys': [
                         {
                             'Name': 'string',
                             'Type': 'string',
                             'Comment': 'string',
                             'Parameters': {
                                 'string': 'string'
                             }
                         },
                     ],
                     'ViewOriginalText': 'string',
                     'ViewExpandedText': 'string',
                     'TableType': 'string',
                     'Parameters': {
                         'string': 'string'
                     },
                     'CreatedBy': 'string',
                     'IsRegisteredWithLakeFormation': True|False,
                     'TargetTable': {
                         'CatalogId': 'string',
                         'DatabaseName': 'string',
                         'Name': 'string',
                         'Region': 'string'
                     },
                     'CatalogId': 'string',
                     'VersionId': 'string',
                     'FederatedTable': {
                         'Identifier': 'string',
                         'DatabaseIdentifier': 'string',
                         'ConnectionName': 'string',
                         'ConnectionType': 'string'
                     },
                     'ViewDefinition': {
                         'IsProtected': True|False,
                         'Definer': 'string',
                         'SubObjects': [
                             'string',
                         ],
                         'Representations': [
                             {
                                 'Dialect': 'REDSHIFT'|'ATHENA'|'SPARK',
                                 'DialectVersion': 'string',
                                 'ViewOriginalText': 'string',
                                 'ViewExpandedText': 'string',
                                 'ValidationConnection': 'string',
                                 'IsStale': True|False
                             },
                         ]
                     },
                     'IsMultiDialectView': True|False,
                     'Status': {
                         'RequestedBy': 'string',
                         'UpdatedBy': 'string',
                         'RequestTime': datetime(2015, 1, 1),
                         'UpdateTime': datetime(2015, 1, 1),
                         'Action': 'UPDATE'|'CREATE',
                         'State': 'QUEUED'|'IN_PROGRESS'|'SUCCESS'|'STOPPED'|'FAILED',
                         'Error': {
                             'ErrorCode': 'string',
                             'ErrorMessage': 'string'
                         },
                         'Details': {
                             'RequestedChange': {'... recursive ...'},
                             'ViewValidations': [
                                 {
                                     'Dialect': 'REDSHIFT'|'ATHENA'|'SPARK',
                                     'DialectVersion': 'string',
                                     'ViewValidationText': 'string',
                                     'UpdateTime': datetime(2015, 1, 1),
                                     'State': 'QUEUED'|'IN_PROGRESS'|'SUCCESS'|'STOPPED'|'FAILED',
                                     'Error': {
                                         'ErrorCode': 'string',
                                         'ErrorMessage': 'string'
                                     }
                                 },
                             ]
                         }
                     }
                 },
                 'VersionId': 'string'
             }
         }

      **Response Structure**

      * *(dict) --*

        * **TableVersion** *(dict) --*

          The requested table version.

          * **Table** *(dict) --*

            The table in question.

            * **Name** *(string) --*

              The table name. For Hive compatibility, this must be
              entirely lowercase.

            * **DatabaseName** *(string) --*

              The name of the database where the table metadata
              resides. For Hive compatibility, this must be all
              lowercase.

            * **Description** *(string) --*

              A description of the table.

            * **Owner** *(string) --*

              The owner of the table.

            * **CreateTime** *(datetime) --*

              The time when the table definition was created in the
              Data Catalog.

            * **UpdateTime** *(datetime) --*

              The last time that the table was updated.

            * **LastAccessTime** *(datetime) --*

              The last time that the table was accessed. This is
              usually taken from HDFS, and might not be reliable.

            * **LastAnalyzedTime** *(datetime) --*

              The last time that column statistics were computed for
              this table.

            * **Retention** *(integer) --*

              The retention time for this table.

            * **StorageDescriptor** *(dict) --*

              A storage descriptor containing information about the
              physical storage of this table.

              * **Columns** *(list) --*

                A list of the "Columns" in the table.

                * *(dict) --*

                  A column in a "Table".

                  * **Name** *(string) --*

                    The name of the "Column".

                  * **Type** *(string) --*

                    The data type of the "Column".

                  * **Comment** *(string) --*

                    A free-form text comment.

                  * **Parameters** *(dict) --*

                    These key-value pairs define properties associated
                    with the column.

                    * *(string) --*

                      * *(string) --*

              * **Location** *(string) --*

                The physical location of the table. By default, this
                takes the form of the warehouse location, followed by
                the database location in the warehouse, followed by
                the table name.

              * **AdditionalLocations** *(list) --*

                A list of locations that point to the path where a
                Delta table is located.

                * *(string) --*

              * **InputFormat** *(string) --*

                The input format: "SequenceFileInputFormat" (binary),
                or "TextInputFormat", or a custom format.

              * **OutputFormat** *(string) --*

                The output format: "SequenceFileOutputFormat"
                (binary), or "IgnoreKeyTextOutputFormat", or a custom
                format.

              * **Compressed** *(boolean) --*

                "True" if the data in the table is compressed, or
                "False" if not.

              * **NumberOfBuckets** *(integer) --*

                Must be specified if the table contains any dimension
                columns.

              * **SerdeInfo** *(dict) --*

                The serialization/deserialization (SerDe) information.

                * **Name** *(string) --*

                  Name of the SerDe.

                * **SerializationLibrary** *(string) --*

                  Usually the class that implements the SerDe. An
                  example is
                  "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe".

                * **Parameters** *(dict) --*

                  These key-value pairs define initialization
                  parameters for the SerDe.

                  * *(string) --*

                    * *(string) --*

              * **BucketColumns** *(list) --*

                A list of reducer grouping columns, clustering
                columns, and bucketing columns in the table.

                * *(string) --*

              * **SortColumns** *(list) --*

                A list specifying the sort order of each bucket in the
                table.

                * *(dict) --*

                  Specifies the sort order of a sorted column.

                  * **Column** *(string) --*

                    The name of the column.

                  * **SortOrder** *(integer) --*

                    Indicates that the column is sorted in ascending
                    order ("== 1"), or in descending order ("== 0").

              * **Parameters** *(dict) --*

                The user-supplied properties in key-value form.

                * *(string) --*

                  * *(string) --*

              * **SkewedInfo** *(dict) --*

                The information about values that appear frequently in
                a column (skewed values).

                * **SkewedColumnNames** *(list) --*

                  A list of names of columns that contain skewed
                  values.

                  * *(string) --*

                * **SkewedColumnValues** *(list) --*

                  A list of values that appear so frequently as to be
                  considered skewed.

                  * *(string) --*

                * **SkewedColumnValueLocationMaps** *(dict) --*

                  A mapping of skewed values to the columns that
                  contain them.

                  * *(string) --*

                    * *(string) --*

              * **StoredAsSubDirectories** *(boolean) --*

                "True" if the table data is stored in subdirectories,
                or "False" if not.

              * **SchemaReference** *(dict) --*

                An object that references a schema stored in the Glue
                Schema Registry.

                When creating a table, you can pass an empty list of
                columns for the schema, and instead use a schema
                reference.

                * **SchemaId** *(dict) --*

                  A structure that contains schema identity fields.
                  Either this or the "SchemaVersionId" has to be
                  provided.

                  * **SchemaArn** *(string) --*

                    The Amazon Resource Name (ARN) of the schema. One
                    of "SchemaArn" or "SchemaName" has to be provided.

                  * **SchemaName** *(string) --*

                    The name of the schema. One of "SchemaArn" or
                    "SchemaName" has to be provided.

                  * **RegistryName** *(string) --*

                    The name of the schema registry that contains the
                    schema.

                * **SchemaVersionId** *(string) --*

                  The unique ID assigned to a version of the schema.
                  Either this or the "SchemaId" has to be provided.

                * **SchemaVersionNumber** *(integer) --*

                  The version number of the schema.

            * **PartitionKeys** *(list) --*

              A list of columns by which the table is partitioned.
              Only primitive types are supported as partition keys.

              When you create a table used by Amazon Athena, and you
              do not specify any "PartitionKeys", you must at least
              set the value of "PartitionKeys" to an empty list. For
              example:

              ""PartitionKeys": []"

              * *(dict) --*

                A column in a "Table".

                * **Name** *(string) --*

                  The name of the "Column".

                * **Type** *(string) --*

                  The data type of the "Column".

                * **Comment** *(string) --*

                  A free-form text comment.

                * **Parameters** *(dict) --*

                  These key-value pairs define properties associated
                  with the column.

                  * *(string) --*

                    * *(string) --*

            * **ViewOriginalText** *(string) --*

              Included for Apache Hive compatibility. Not used in the
              normal course of Glue operations. If the table is a
              "VIRTUAL_VIEW", this field contains certain Athena
              configuration encoded in base64.

            * **ViewExpandedText** *(string) --*

              Included for Apache Hive compatibility. Not used in the
              normal course of Glue operations.

            * **TableType** *(string) --*

              The type of this table. Glue will create tables with the
              "EXTERNAL_TABLE" type. Other services, such as Athena,
              may create tables with additional table types.

              Glue related table types:

                 EXTERNAL_TABLE

              Hive compatible attribute - indicates a non-Hive managed
              table.

                 GOVERNED

              Used by Lake Formation. The Glue Data Catalog
              understands "GOVERNED".

            * **Parameters** *(dict) --*

              These key-value pairs define properties associated with
              the table.

              * *(string) --*

                * *(string) --*

            * **CreatedBy** *(string) --*

              The person or entity who created the table.

            * **IsRegisteredWithLakeFormation** *(boolean) --*

              Indicates whether the table has been registered with
              Lake Formation.

            * **TargetTable** *(dict) --*

              A "TableIdentifier" structure that describes a target
              table for resource linking.

              * **CatalogId** *(string) --*

                The ID of the Data Catalog in which the table resides.

              * **DatabaseName** *(string) --*

                The name of the catalog database that contains the
                target table.

              * **Name** *(string) --*

                The name of the target table.

              * **Region** *(string) --*

                Region of the target table.

            * **CatalogId** *(string) --*

              The ID of the Data Catalog in which the table resides.

            * **VersionId** *(string) --*

              The ID of the table version.

            * **FederatedTable** *(dict) --*

              A "FederatedTable" structure that references an entity
              outside the Glue Data Catalog.

              * **Identifier** *(string) --*

                A unique identifier for the federated table.

              * **DatabaseIdentifier** *(string) --*

                A unique identifier for the federated database.

              * **ConnectionName** *(string) --*

                The name of the connection to the external metastore.

              * **ConnectionType** *(string) --*

                The type of connection used to access the federated
                table, specifying the protocol or method for
                connecting to the external data source.

            * **ViewDefinition** *(dict) --*

              A structure that contains all the information that
              defines the view, including the dialect or dialects for
              the view, and the query.

              * **IsProtected** *(boolean) --*

                You can set this flag as true to instruct the engine
                not to push user-provided operations into the logical
                plan of the view during query planning. However,
                setting this flag does not guarantee that the engine
                will comply. Refer to the engine's documentation to
                understand the guarantees provided, if any.

              * **Definer** *(string) --*

                The definer of a view in SQL.

              * **SubObjects** *(list) --*

                A list of table Amazon Resource Names (ARNs).

                * *(string) --*

              * **Representations** *(list) --*

                A list of representations.

                * *(dict) --*

                  A structure that contains the dialect of the view,
                  and the query that defines the view.

                  * **Dialect** *(string) --*

                    The dialect of the query engine.

                  * **DialectVersion** *(string) --*

                    The version of the dialect of the query engine.
                    For example, 3.0.0.

                  * **ViewOriginalText** *(string) --*

                    The "SELECT" query provided by the customer during
                    "CREATE VIEW DDL". This SQL is not used during a
                    query on a view ("ViewExpandedText" is used
                    instead). "ViewOriginalText" is used for cases
                    like "SHOW CREATE VIEW" where users want to see
                    the original DDL command that created the view.

                  * **ViewExpandedText** *(string) --*

                    The expanded SQL for the view. This SQL is used by
                    engines while processing a query on a view.
                    Engines may perform operations during view
                    creation to transform "ViewOriginalText" to
                    "ViewExpandedText". For example:

                    * Fully qualified identifiers: "SELECT * from
                      table1 -> SELECT * from db1.table1"

                  * **ValidationConnection** *(string) --*

                    The name of the connection to be used to validate
                    the specific representation of the view.

                  * **IsStale** *(boolean) --*

                    Dialects marked as stale are no longer valid and
                    must be updated before they can be queried in
                    their respective query engines.

            * **IsMultiDialectView** *(boolean) --*

              Specifies whether the view supports the SQL dialects of
              one or more different query engines and can therefore be
              read by those engines.

            * **Status** *(dict) --*

              A structure containing information about the state of an
              asynchronous change to a table.

              * **RequestedBy** *(string) --*

                The ARN of the user who requested the asynchronous
                change.

              * **UpdatedBy** *(string) --*

                The ARN of the user who last manually altered the
                asynchronous change (for example, by requesting
                cancellation).

              * **RequestTime** *(datetime) --*

                An ISO 8601 formatted date string indicating the time
                that the change was initiated.

              * **UpdateTime** *(datetime) --*

                An ISO 8601 formatted date string indicating the time
                that the state was last updated.

              * **Action** *(string) --*

                Indicates which action was called on the table,
                currently only "CREATE" or "UPDATE".

              * **State** *(string) --*

                A generic status for the change in progress, such as
                QUEUED, IN_PROGRESS, SUCCESS, or FAILED.

              * **Error** *(dict) --*

                An error that will only appear when the state is
                "FAILED". This is a parent-level exception message;
                there may be a different "Error" for each dialect.

                * **ErrorCode** *(string) --*

                  The code associated with this error.

                * **ErrorMessage** *(string) --*

                  A message describing the error.

              * **Details** *(dict) --*

                A "StatusDetails" object with information about the
                requested change.

                * **RequestedChange** *(dict) --*

                  A "Table" object representing the requested changes.

                * **ViewValidations** *(list) --*

                  A list of "ViewValidation" objects that contain
                  information for an analytical engine to validate a
                  view.

                  * *(dict) --*

                    A structure that contains information for an
                    analytical engine to validate a view, prior to
                    persisting the view metadata. Used in the case of
                    direct "UpdateTable" or "CreateTable" API calls.

                    * **Dialect** *(string) --*

                      The dialect of the query engine.

                    * **DialectVersion** *(string) --*

                      The version of the dialect of the query engine.
                      For example, 3.0.0.

                    * **ViewValidationText** *(string) --*

                      The "SELECT" query that defines the view, as
                      provided by the customer.

                    * **UpdateTime** *(datetime) --*

                      The time of the last update.

                    * **State** *(string) --*

                      The state of the validation.

                    * **Error** *(dict) --*

                      An error associated with the validation.

                      * **ErrorCode** *(string) --*

                        The code associated with this error.

                      * **ErrorMessage** *(string) --*

                        A message describing the error.

          * **VersionId** *(string) --*

            The ID value that identifies this table version. A
            "VersionId" is a string representation of an integer. Each
            version is incremented by 1.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.GlueEncryptionException"
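
   Because "VersionId" is documented above as a string representation
   of an integer, comparing version IDs as plain strings can order
   them incorrectly (the string "10" sorts before "9"). The following
   is a minimal sketch of an integer-aware comparison. The helper name
   "latest_version_id" is hypothetical and is not part of boto3; it
   assumes you have already collected the "VersionId" values yourself.

```python
def latest_version_id(version_ids):
    """Return the numerically largest VersionId from an iterable of strings.

    Hypothetical helper, not part of boto3. VersionId values are
    documented as string representations of integers, so they are
    compared as integers rather than lexicographically.
    """
    ids = list(version_ids)
    if not ids:
        raise ValueError("no version IDs supplied")
    # key=int compares numeric values but returns the original string.
    return max(ids, key=int)
```

   The original string is returned unchanged, so it can be passed
   back to the service wherever a "VersionId" is expected.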


list_crawls
***********

Glue.Client.list_crawls(**kwargs)

   Returns all the crawls of a specified crawler. Returns only the
   crawls that have occurred since the launch date of the crawler
   history feature, and only retains up to 12 months of crawls. Older
   crawls will not be returned.

   You may use this API to:

   * Retrieve all the crawls of a specified crawler.

   * Retrieve all the crawls of a specified crawler within a limited
     count.

   * Retrieve all the crawls of a specified crawler in a specific time
     range.

   * Retrieve all the crawls of a specified crawler with a particular
     state, crawl ID, or DPU hour value.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.list_crawls(
          CrawlerName='string',
          MaxResults=123,
          Filters=[
              {
                  'FieldName': 'CRAWL_ID'|'STATE'|'START_TIME'|'END_TIME'|'DPU_HOUR',
                  'FilterOperator': 'GT'|'GE'|'LT'|'LE'|'EQ'|'NE',
                  'FieldValue': 'string'
              },
          ],
          NextToken='string'
      )

   Parameters:
      * **CrawlerName** (*string*) --

        **[REQUIRED]**

        The name of the crawler whose runs you want to retrieve.

      * **MaxResults** (*integer*) -- The maximum number of results to
        return. The default is 20, and the maximum is 100.

      * **Filters** (*list*) --

        Filters the crawls by the criteria you specify in a list of
        "CrawlsFilter" objects.

        * *(dict) --*

          A list of fields, comparators, and values that you can use
          to filter the crawler runs for a specified crawler.

          * **FieldName** *(string) --*

            A key used to filter the crawler runs for a specified
            crawler. Valid values for each of the field names are:

            * "CRAWL_ID": A string representing the UUID identifier
              for a crawl.

            * "STATE": A string representing the state of the crawl.

            * "START_TIME" and "END_TIME": The epoch timestamp in
              milliseconds.

            * "DPU_HOUR": The number of data processing unit (DPU)
              hours used for the crawl.

          * **FilterOperator** *(string) --*

            A defined comparator that operates on the value. The
            available operators are:

            * "GT": Greater than.

            * "GE": Greater than or equal to.

            * "LT": Less than.

            * "LE": Less than or equal to.

            * "EQ": Equal to.

            * "NE": Not equal to.

          * **FieldValue** *(string) --*

            The value provided for comparison on the crawl field.

      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation call.
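
      The "Filters" entries described above can be assembled with a
      few small helpers. This is a sketch under stated assumptions,
      not an official recipe: the function names are hypothetical,
      "FAILED" is taken from the "State" values in the response
      syntax, and "START_TIME" is sent as epoch milliseconds in a
      string because "FieldValue" is typed as a string.

```python
from datetime import datetime, timezone


def to_epoch_millis(moment):
    """Convert a timezone-aware datetime to epoch milliseconds as a string."""
    if moment.tzinfo is None:
        # Naive datetimes would be interpreted in local time; reject them.
        raise ValueError("expected a timezone-aware datetime")
    return str(int(moment.timestamp() * 1000))


def crawl_filter(field_name, operator, value):
    """Build one filter dict; FieldValue is always sent as a string."""
    return {
        "FieldName": field_name,
        "FilterOperator": operator,
        "FieldValue": str(value),
    }


def failed_crawls_since(start):
    """Hypothetical helper: filters for FAILED crawls started at or after `start`."""
    return [
        crawl_filter("STATE", "EQ", "FAILED"),
        crawl_filter("START_TIME", "GE", to_epoch_millis(start)),
    ]
```

      The returned list can be passed as the "Filters" argument of
      "list_crawls", together with the required "CrawlerName".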

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Crawls': [
                 {
                     'CrawlId': 'string',
                     'State': 'RUNNING'|'COMPLETED'|'FAILED'|'STOPPED',
                     'StartTime': datetime(2015, 1, 1),
                     'EndTime': datetime(2015, 1, 1),
                     'Summary': 'string',
                     'ErrorMessage': 'string',
                     'LogGroup': 'string',
                     'LogStream': 'string',
                     'MessagePrefix': 'string',
                     'DPUHour': 123.0
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Crawls** *(list) --*

          A list of "CrawlerHistory" objects representing the crawl
          runs that meet your criteria.

          * *(dict) --*

            Contains the information for a run of a crawler.

            * **CrawlId** *(string) --*

              A UUID identifier for each crawl.

            * **State** *(string) --*

              The state of the crawl.

            * **StartTime** *(datetime) --*

              The date and time on which the crawl started.

            * **EndTime** *(datetime) --*

              The date and time on which the crawl ended.

            * **Summary** *(string) --*

              A run summary for the specific crawl in JSON. Contains
              the catalog tables and partitions that were added,
              updated, or deleted.

            * **ErrorMessage** *(string) --*

              If an error occurred, the error message associated with
              the crawl.

            * **LogGroup** *(string) --*

              The log group associated with the crawl.

            * **LogStream** *(string) --*

              The log stream associated with the crawl.

            * **MessagePrefix** *(string) --*

              The prefix for a CloudWatch message about this crawl.

            * **DPUHour** *(float) --*

              The number of data processing unit (DPU) hours used for
              the crawl.

        * **NextToken** *(string) --*

          A continuation token for paginating the returned list of
          crawls, returned if the current segment of the list is not
          the last.

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InvalidInputException"
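
   Since "list_crawls" returns at most "MaxResults" crawls per call
   and signals further results through "NextToken", retrieving a full
   history requires a loop. The following is a minimal sketch of that
   loop; "iter_crawls" is a hypothetical helper name, and "client" is
   assumed to be a Glue client created with "boto3.client('glue')".

```python
def iter_crawls(client, crawler_name, filters=None, page_size=100):
    """Yield every crawl dict for a crawler, following NextToken.

    Hypothetical helper: `client` is assumed to be a boto3 Glue client.
    """
    kwargs = {"CrawlerName": crawler_name, "MaxResults": page_size}
    if filters:
        kwargs["Filters"] = filters
    while True:
        response = client.list_crawls(**kwargs)
        yield from response.get("Crawls", [])
        token = response.get("NextToken")
        if not token:
            # No token means the last segment has been returned.
            break
        kwargs["NextToken"] = token
```

   The generator issues a new request only when the caller consumes
   past the current page, so stopping early avoids further calls.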


put_data_catalog_encryption_settings
************************************

Glue.Client.put_data_catalog_encryption_settings(**kwargs)

   Sets the security configuration for a specified catalog. After the
   configuration has been set, the specified encryption is applied to
   every catalog write thereafter.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.put_data_catalog_encryption_settings(
          CatalogId='string',
          DataCatalogEncryptionSettings={
              'EncryptionAtRest': {
                  'CatalogEncryptionMode': 'DISABLED'|'SSE-KMS'|'SSE-KMS-WITH-SERVICE-ROLE',
                  'SseAwsKmsKeyId': 'string',
                  'CatalogEncryptionServiceRole': 'string'
              },
              'ConnectionPasswordEncryption': {
                  'ReturnConnectionPasswordEncrypted': True|False,
                  'AwsKmsKeyId': 'string'
              }
          }
      )

   Parameters:
      * **CatalogId** (*string*) -- The ID of the Data Catalog to set
        the security configuration for. If none is provided, the
        Amazon Web Services account ID is used by default.

      * **DataCatalogEncryptionSettings** (*dict*) --

        **[REQUIRED]**

        The security configuration to set.

        * **EncryptionAtRest** *(dict) --*

          Specifies the encryption-at-rest configuration for the Data
          Catalog.

          * **CatalogEncryptionMode** *(string) --* **[REQUIRED]**

            The encryption-at-rest mode for encrypting Data Catalog
            data.

          * **SseAwsKmsKeyId** *(string) --*

            The ID of the KMS key to use for encryption at rest.

          * **CatalogEncryptionServiceRole** *(string) --*

            The role that Glue assumes to encrypt and decrypt the Data
            Catalog objects on the caller's behalf.

        * **ConnectionPasswordEncryption** *(dict) --*

          When connection password protection is enabled, the Data
          Catalog uses a customer-provided key to encrypt the password
          as part of "CreateConnection" or "UpdateConnection" and
          store it in the "ENCRYPTED_PASSWORD" field in the connection
          properties. You can enable catalog encryption or only
          password encryption.

          * **ReturnConnectionPasswordEncrypted** *(boolean) --*
            **[REQUIRED]**

            When the "ReturnConnectionPasswordEncrypted" flag is set
            to "true", passwords remain encrypted in the responses of
            "GetConnection" and "GetConnections". This encryption
            takes effect independently from catalog encryption.

          * **AwsKmsKeyId** *(string) --*

            A KMS key that is used to encrypt the connection
            password.

            If connection password protection is enabled, the caller
            of "CreateConnection" and "UpdateConnection" needs at
            least "kms:Encrypt" permission on the specified KMS key,
            to encrypt passwords before storing them in the Data
            Catalog.

            You can set the decrypt permission to enable or restrict
            access on the password key according to your security
            requirements.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*

   **Exceptions**

   * "Glue.Client.exceptions.InternalServiceException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"
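
   The nested "DataCatalogEncryptionSettings" structure is easy to
   get wrong by hand, so it can help to build it in one place. The
   sketch below is illustrative only: both helper names are
   hypothetical, and reusing a single KMS key for encryption at rest
   and for connection passwords is an assumption made for brevity,
   not a requirement of the API.

```python
def build_encryption_settings(kms_key_id, encrypt_passwords=True):
    """Build a DataCatalogEncryptionSettings dict using SSE-KMS at rest.

    Hypothetical helper. Assumes the same KMS key is used for both
    catalog encryption and connection password encryption.
    """
    settings = {
        "EncryptionAtRest": {
            "CatalogEncryptionMode": "SSE-KMS",
            "SseAwsKmsKeyId": kms_key_id,
        },
        "ConnectionPasswordEncryption": {
            "ReturnConnectionPasswordEncrypted": encrypt_passwords,
        },
    }
    if encrypt_passwords:
        settings["ConnectionPasswordEncryption"]["AwsKmsKeyId"] = kms_key_id
    return settings


def apply_encryption_settings(client, kms_key_id, catalog_id=None):
    """Send the settings; CatalogId is omitted to use the account default."""
    kwargs = {"DataCatalogEncryptionSettings": build_encryption_settings(kms_key_id)}
    if catalog_id is not None:
        kwargs["CatalogId"] = catalog_id
    return client.put_data_catalog_encryption_settings(**kwargs)
```

   A successful call returns an empty dict, so failures surface only
   as one of the exceptions listed above.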


get_crawlers
************

Glue.Client.get_crawlers(**kwargs)

   Retrieves metadata for all crawlers defined in the customer
   account.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_crawlers(
          MaxResults=123,
          NextToken='string'
      )

   Parameters:
      * **MaxResults** (*integer*) -- The number of crawlers to return
        on each call.

      * **NextToken** (*string*) -- A continuation token, if this is a
        continuation request.
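
      Instead of passing "NextToken" by hand, the continuation logic
      can be delegated to a boto3 paginator. This is a sketch that
      assumes the installed boto3 version exposes a paginator for
      this operation, which can be checked with
      "client.can_paginate('get_crawlers')". The helper name
      "crawler_names" is hypothetical.

```python
def crawler_names(client, page_size=50):
    """Return the names of all crawlers, using the client's paginator.

    Hypothetical helper: `client` is assumed to be a boto3 Glue client
    for which can_paginate('get_crawlers') is true.
    """
    paginator = client.get_paginator("get_crawlers")
    names = []
    # PageSize is forwarded to the service as MaxResults on each call.
    for page in paginator.paginate(PaginationConfig={"PageSize": page_size}):
        names.extend(crawler["Name"] for crawler in page.get("Crawlers", []))
    return names
```

      The paginator handles "NextToken" internally, so the caller
      only iterates over pages.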

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'Crawlers': [
                 {
                     'Name': 'string',
                     'Role': 'string',
                     'Targets': {
                         'S3Targets': [
                             {
                                 'Path': 'string',
                                 'Exclusions': [
                                     'string',
                                 ],
                                 'ConnectionName': 'string',
                                 'SampleSize': 123,
                                 'EventQueueArn': 'string',
                                 'DlqEventQueueArn': 'string'
                             },
                         ],
                         'JdbcTargets': [
                             {
                                 'ConnectionName': 'string',
                                 'Path': 'string',
                                 'Exclusions': [
                                     'string',
                                 ],
                                 'EnableAdditionalMetadata': [
                                     'COMMENTS'|'RAWTYPES',
                                 ]
                             },
                         ],
                         'MongoDBTargets': [
                             {
                                 'ConnectionName': 'string',
                                 'Path': 'string',
                                 'ScanAll': True|False
                             },
                         ],
                         'DynamoDBTargets': [
                             {
                                 'Path': 'string',
                                 'scanAll': True|False,
                                 'scanRate': 123.0
                             },
                         ],
                         'CatalogTargets': [
                             {
                                 'DatabaseName': 'string',
                                 'Tables': [
                                     'string',
                                 ],
                                 'ConnectionName': 'string',
                                 'EventQueueArn': 'string',
                                 'DlqEventQueueArn': 'string'
                             },
                         ],
                         'DeltaTargets': [
                             {
                                 'DeltaTables': [
                                     'string',
                                 ],
                                 'ConnectionName': 'string',
                                 'WriteManifest': True|False,
                                 'CreateNativeDeltaTable': True|False
                             },
                         ],
                         'IcebergTargets': [
                             {
                                 'Paths': [
                                     'string',
                                 ],
                                 'ConnectionName': 'string',
                                 'Exclusions': [
                                     'string',
                                 ],
                                 'MaximumTraversalDepth': 123
                             },
                         ],
                         'HudiTargets': [
                             {
                                 'Paths': [
                                     'string',
                                 ],
                                 'ConnectionName': 'string',
                                 'Exclusions': [
                                     'string',
                                 ],
                                 'MaximumTraversalDepth': 123
                             },
                         ]
                     },
                     'DatabaseName': 'string',
                     'Description': 'string',
                     'Classifiers': [
                         'string',
                     ],
                     'RecrawlPolicy': {
                         'RecrawlBehavior': 'CRAWL_EVERYTHING'|'CRAWL_NEW_FOLDERS_ONLY'|'CRAWL_EVENT_MODE'
                     },
                     'SchemaChangePolicy': {
                         'UpdateBehavior': 'LOG'|'UPDATE_IN_DATABASE',
                         'DeleteBehavior': 'LOG'|'DELETE_FROM_DATABASE'|'DEPRECATE_IN_DATABASE'
                     },
                     'LineageConfiguration': {
                         'CrawlerLineageSettings': 'ENABLE'|'DISABLE'
                     },
                     'State': 'READY'|'RUNNING'|'STOPPING',
                     'TablePrefix': 'string',
                     'Schedule': {
                         'ScheduleExpression': 'string',
                         'State': 'SCHEDULED'|'NOT_SCHEDULED'|'TRANSITIONING'
                     },
                     'CrawlElapsedTime': 123,
                     'CreationTime': datetime(2015, 1, 1),
                     'LastUpdated': datetime(2015, 1, 1),
                     'LastCrawl': {
                         'Status': 'SUCCEEDED'|'CANCELLED'|'FAILED',
                         'ErrorMessage': 'string',
                         'LogGroup': 'string',
                         'LogStream': 'string',
                         'MessagePrefix': 'string',
                         'StartTime': datetime(2015, 1, 1)
                     },
                     'Version': 123,
                     'Configuration': 'string',
                     'CrawlerSecurityConfiguration': 'string',
                     'LakeFormationConfiguration': {
                         'UseLakeFormationCredentials': True|False,
                         'AccountId': 'string'
                     }
                 },
             ],
             'NextToken': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **Crawlers** *(list) --*

          A list of crawler metadata.

          * *(dict) --*

            Specifies a crawler program that examines a data source
            and uses classifiers to try to determine its schema. If
            successful, the crawler records metadata concerning the
            data source in the Glue Data Catalog.

            * **Name** *(string) --*

              The name of the crawler.

            * **Role** *(string) --*

              The Amazon Resource Name (ARN) of an IAM role that's
              used to access customer resources, such as Amazon Simple
              Storage Service (Amazon S3) data.

            * **Targets** *(dict) --*

              A collection of targets to crawl.

              * **S3Targets** *(list) --*

                Specifies Amazon Simple Storage Service (Amazon S3)
                targets.

                * *(dict) --*

                  Specifies a data store in Amazon Simple Storage
                  Service (Amazon S3).

                  * **Path** *(string) --*

                    The path to the Amazon S3 target.

                  * **Exclusions** *(list) --*

                    A list of glob patterns used to exclude from the
                    crawl. For more information, see Catalog Tables
                    with a Crawler.

                    * *(string) --*

                  * **ConnectionName** *(string) --*

                    The name of a connection which allows a job or
                    crawler to access data in Amazon S3 within an
                    Amazon Virtual Private Cloud environment (Amazon
                    VPC).

                  * **SampleSize** *(integer) --*

                    Sets the number of files in each leaf folder to be
                    crawled when crawling sample files in a dataset.
                    If not set, all the files are crawled. A valid
                    value is an integer between 1 and 249.

                  * **EventQueueArn** *(string) --*

                    A valid Amazon SQS ARN. For example,
                    "arn:aws:sqs:region:account:sqs".

                  * **DlqEventQueueArn** *(string) --*

                    A valid Amazon dead-letter SQS ARN. For example,
                    "arn:aws:sqs:region:account:deadLetterQueue".

              * **JdbcTargets** *(list) --*

                Specifies JDBC targets.

                * *(dict) --*

                  Specifies a JDBC data store to crawl.

                  * **ConnectionName** *(string) --*

                    The name of the connection to use to connect to
                    the JDBC target.

                  * **Path** *(string) --*

                    The path of the JDBC target.

                  * **Exclusions** *(list) --*

                    A list of glob patterns used to exclude from the
                    crawl. For more information, see Catalog Tables
                    with a Crawler.

                    * *(string) --*

                  * **EnableAdditionalMetadata** *(list) --*

                    Specify a value of "RAWTYPES" or "COMMENTS" to
                    enable additional metadata in table responses.
                    "RAWTYPES" provides the native-level datatype.
                    "COMMENTS" provides comments associated with a
                    column or table in the database.

                    If you do not need additional metadata, keep the
                    field empty.

                    * *(string) --*

              * **MongoDBTargets** *(list) --*

                Specifies Amazon DocumentDB or MongoDB targets.

                * *(dict) --*

                  Specifies an Amazon DocumentDB or MongoDB data store
                  to crawl.

                  * **ConnectionName** *(string) --*

                    The name of the connection to use to connect to
                    the Amazon DocumentDB or MongoDB target.

                  * **Path** *(string) --*

                    The path of the Amazon DocumentDB or MongoDB
                    target (database/collection).

                  * **ScanAll** *(boolean) --*

                    Indicates whether to scan all the records, or to
                    sample rows from the table. Scanning all the
                    records can take a long time when the table is not
                    a high throughput table.

                    A value of "true" means to scan all records, while
                    a value of "false" means to sample the records. If
                    no value is specified, the value defaults to
                    "true".

              * **DynamoDBTargets** *(list) --*

                Specifies Amazon DynamoDB targets.

                * *(dict) --*

                  Specifies an Amazon DynamoDB table to crawl.

                  * **Path** *(string) --*

                    The name of the DynamoDB table to crawl.

                  * **scanAll** *(boolean) --*

                    Indicates whether to scan all the records, or to
                    sample rows from the table. Scanning all the
                    records can take a long time when the table is not
                    a high throughput table.

                    A value of "true" means to scan all records, while
                    a value of "false" means to sample the records. If
                    no value is specified, the value defaults to
                    "true".

                  * **scanRate** *(float) --*

                    The percentage of the configured read capacity
                    units for the Glue crawler to use. Read capacity
                    units is a term defined by DynamoDB, and is a
                    numeric value that acts as a rate limiter for the
                    number of reads that can be performed on that
                    table per second.

                    The valid values are null or a value between 0.1
                    and 1.5. A null value is used when the user does
                    not provide a value, and defaults to 0.5 of the
                    configured Read Capacity Unit (for provisioned
                    tables), or 0.25 of the max configured Read
                    Capacity Unit (for tables using on-demand mode).

              * **CatalogTargets** *(list) --*

                Specifies Glue Data Catalog targets.

                * *(dict) --*

                  Specifies a Glue Data Catalog target.

                  * **DatabaseName** *(string) --*

                    The name of the database to be synchronized.

                  * **Tables** *(list) --*

                    A list of the tables to be synchronized.

                    * *(string) --*

                  * **ConnectionName** *(string) --*

                    The name of the connection for an Amazon S3-backed
                    Data Catalog table to be a target of the crawl
                    when using a "Catalog" connection type paired with
                    a "NETWORK" Connection type.

                  * **EventQueueArn** *(string) --*

                    A valid Amazon SQS ARN. For example,
                    "arn:aws:sqs:region:account:sqs".

                  * **DlqEventQueueArn** *(string) --*

                    A valid Amazon dead-letter SQS ARN. For example,
                    "arn:aws:sqs:region:account:deadLetterQueue".

              * **DeltaTargets** *(list) --*

                Specifies Delta data store targets.

                * *(dict) --*

                  Specifies a Delta data store to crawl one or more
                  Delta tables.

                  * **DeltaTables** *(list) --*

                    A list of the Amazon S3 paths to the Delta tables.

                    * *(string) --*

                  * **ConnectionName** *(string) --*

                    The name of the connection to use to connect to
                    the Delta table target.

                  * **WriteManifest** *(boolean) --*

                    Specifies whether to write the manifest files to
                    the Delta table path.

                  * **CreateNativeDeltaTable** *(boolean) --*

                    Specifies whether the crawler will create native
                    tables, to allow integration with query engines
                    that support querying of the Delta transaction log
                    directly.

              * **IcebergTargets** *(list) --*

                Specifies Apache Iceberg data store targets.

                * *(dict) --*

                  Specifies an Apache Iceberg data source where
                  Iceberg tables are stored in Amazon S3.

                  * **Paths** *(list) --*

                    One or more Amazon S3 paths that contain Iceberg
                    metadata folders as "s3://bucket/prefix".

                    * *(string) --*

                  * **ConnectionName** *(string) --*

                    The name of the connection to use to connect to
                    the Iceberg target.

                  * **Exclusions** *(list) --*

                    A list of glob patterns used to exclude from the
                    crawl. For more information, see Catalog Tables
                    with a Crawler.

                    * *(string) --*

                  * **MaximumTraversalDepth** *(integer) --*

                    The maximum depth of Amazon S3 paths that the
                    crawler can traverse to discover the Iceberg
                    metadata folder in your Amazon S3 path. Used to
                    limit the crawler run time.

              * **HudiTargets** *(list) --*

                Specifies Apache Hudi data store targets.

                * *(dict) --*

                  Specifies an Apache Hudi data source.

                  * **Paths** *(list) --*

                    An array of Amazon S3 location strings for Hudi,
                    each indicating the root folder in which the
                    metadata files for a Hudi table reside. The Hudi
                    folder may be located in a child folder of the
                    root folder.

                    The crawler will scan all folders underneath a
                    path for a Hudi folder.

                    * *(string) --*

                  * **ConnectionName** *(string) --*

                    The name of the connection to use to connect to
                    the Hudi target. If your Hudi files are stored in
                    buckets that require VPC authorization, you can
                    set their connection properties here.

                  * **Exclusions** *(list) --*

                    A list of glob patterns used to exclude from the
                    crawl. For more information, see Catalog Tables
                    with a Crawler.

                    * *(string) --*

                  * **MaximumTraversalDepth** *(integer) --*

                    The maximum depth of Amazon S3 paths that the
                    crawler can traverse to discover the Hudi metadata
                    folder in your Amazon S3 path. Used to limit the
                    crawler run time.

            * **DatabaseName** *(string) --*

              The name of the database in which the crawler's output
              is stored.

            * **Description** *(string) --*

              A description of the crawler.

            * **Classifiers** *(list) --*

              A list of UTF-8 strings that specify the custom
              classifiers that are associated with the crawler.

              * *(string) --*

            * **RecrawlPolicy** *(dict) --*

              A policy that specifies whether to crawl the entire
              dataset again, or to crawl only folders that were added
              since the last crawler run.

              * **RecrawlBehavior** *(string) --*

                Specifies whether to crawl the entire dataset again or
                to crawl only folders that were added since the last
                crawler run.

                A value of "CRAWL_EVERYTHING" specifies crawling the
                entire dataset again.

                A value of "CRAWL_NEW_FOLDERS_ONLY" specifies crawling
                only folders that were added since the last crawler
                run.

                A value of "CRAWL_EVENT_MODE" specifies crawling only
                the changes identified by Amazon S3 events.

            * **SchemaChangePolicy** *(dict) --*

              The policy that specifies update and delete behaviors
              for the crawler.

              * **UpdateBehavior** *(string) --*

                The update behavior when the crawler finds a changed
                schema.

              * **DeleteBehavior** *(string) --*

                The deletion behavior when the crawler finds a deleted
                object.

            * **LineageConfiguration** *(dict) --*

              A configuration that specifies whether data lineage is
              enabled for the crawler.

              * **CrawlerLineageSettings** *(string) --*

                Specifies whether data lineage is enabled for the
                crawler. Valid values are:

                * ENABLE: enables data lineage for the crawler

                * DISABLE: disables data lineage for the crawler

            * **State** *(string) --*

              Indicates whether the crawler is running, or whether a
              run is pending.

            * **TablePrefix** *(string) --*

              The prefix added to the names of tables that are
              created.

            * **Schedule** *(dict) --*

              For scheduled crawlers, the schedule when the crawler
              runs.

              * **ScheduleExpression** *(string) --*

                A "cron" expression used to specify the schedule (see
                Time-Based Schedules for Jobs and Crawlers). For
                example, to run something every day at 12:15 UTC, you
                would specify: "cron(15 12 * * ? *)".

              * **State** *(string) --*

                The state of the schedule.

            * **CrawlElapsedTime** *(integer) --*

              If the crawler is running, contains the total time
              elapsed since the last crawl began.

            * **CreationTime** *(datetime) --*

              The time that the crawler was created.

            * **LastUpdated** *(datetime) --*

              The time that the crawler was last updated.

            * **LastCrawl** *(dict) --*

              The status of the last crawl, and potentially error
              information if an error occurred.

              * **Status** *(string) --*

                Status of the last crawl.

              * **ErrorMessage** *(string) --*

                If an error occurred, the error information about the
                last crawl.

              * **LogGroup** *(string) --*

                The log group for the last crawl.

              * **LogStream** *(string) --*

                The log stream for the last crawl.

              * **MessagePrefix** *(string) --*

                The prefix for a message about this crawl.

              * **StartTime** *(datetime) --*

                The time at which the crawl started.

            * **Version** *(integer) --*

              The version of the crawler.

            * **Configuration** *(string) --*

              Crawler configuration information. This versioned JSON
              string allows users to specify aspects of a crawler's
              behavior. For more information, see Setting crawler
              configuration options.

            * **CrawlerSecurityConfiguration** *(string) --*

              The name of the "SecurityConfiguration" structure to be
              used by this crawler.

            * **LakeFormationConfiguration** *(dict) --*

              Specifies whether the crawler should use Lake Formation
              credentials for the crawler instead of the IAM role
              credentials.

              * **UseLakeFormationCredentials** *(boolean) --*

                Specifies whether to use Lake Formation credentials
                for the crawler instead of the IAM role credentials.

              * **AccountId** *(string) --*

                Required for cross-account crawls. For crawls in the
                same account as the target data, this can be left as
                null.

        * **NextToken** *(string) --*

          A continuation token, if the returned list has not reached
          the end of those defined in this customer account.
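
   As an illustration of reading the response structure above, the
   sketch below picks out crawlers whose "LastCrawl" reports a "FAILED"
   status and collects the fields that help locate the logs. The helper
   name "failed_last_crawls" is hypothetical, and the code assumes that
   any of the nested keys may be missing from a given crawler.

```python
from typing import Any, Dict, Iterable, List


def failed_last_crawls(crawlers: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return name, error and log location for crawlers whose last crawl FAILED."""
    report = []
    for crawler in crawlers:
        # "LastCrawl" may be absent for a crawler that has never run.
        last = crawler.get("LastCrawl") or {}
        if last.get("Status") != "FAILED":
            continue
        report.append(
            {
                "Name": crawler.get("Name"),
                "ErrorMessage": last.get("ErrorMessage"),
                "LogGroup": last.get("LogGroup"),
                "LogStream": last.get("LogStream"),
            }
        )
    return report
```

   The function takes the list found under "Crawlers" in a single
   response, or any iterable of crawler dicts gathered across pages.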

   **Exceptions**

   * "Glue.Client.exceptions.OperationTimeoutException"


start_import_labels_task_run
****************************

Glue.Client.start_import_labels_task_run(**kwargs)

   Enables you to provide additional labels (examples of truth) to be
   used to teach the machine learning transform and improve its
   quality. This API operation is generally used as part of the active
   learning workflow that starts with the
   "StartMLLabelingSetGenerationTaskRun" call and that ultimately
   results in improving the quality of your machine learning
   transform.

   After the "StartMLLabelingSetGenerationTaskRun" finishes, Glue
   machine learning will have generated a series of questions for
   humans to answer. (Answering these questions is often called
   'labeling' in machine learning workflows.) In the case of the
   "FindMatches" transform, these questions are of the form, “What is
   the correct way to group these rows together into groups composed
   entirely of matching records?” After the labeling process is
   finished, users upload their answers/labels with a call to
   "StartImportLabelsTaskRun". After "StartImportLabelsTaskRun"
   finishes, all future runs of the machine learning transform use the
   new and improved labels and perform a higher-quality
   transformation.

   By default, "StartMLLabelingSetGenerationTaskRun" continually
   learns from and combines all labels that you upload unless you set
   "ReplaceAllLabels" to true. If you set "ReplaceAllLabels" to true,
   "StartImportLabelsTaskRun" deletes and forgets all previously
   uploaded labels and learns only from the exact set that you upload.
   Replacing labels can be helpful if you realize that you previously
   uploaded incorrect labels, and you believe that they are having a
   negative effect on your transform quality.

   You can check on the status of your task run by calling the
   "GetMLTaskRun" operation.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.start_import_labels_task_run(
          TransformId='string',
          InputS3Path='string',
          ReplaceAllLabels=True|False
      )

   Parameters:
      * **TransformId** (*string*) --

        **[REQUIRED]**

        The unique identifier of the machine learning transform.

      * **InputS3Path** (*string*) --

        **[REQUIRED]**

        The Amazon Simple Storage Service (Amazon S3) path from where
        you import the labels.

      * **ReplaceAllLabels** (*boolean*) -- Indicates whether to
        overwrite your existing labels.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'TaskRunId': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **TaskRunId** *(string) --*

          The unique identifier for the task run.
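
   The description above suggests starting the import and then checking
   on it with "GetMLTaskRun". The following sketch shows one way to do
   that. It is an illustration under assumptions: the helper name and
   its polling arguments are hypothetical, and the "Status" key and the
   set of terminal status values are taken from the "get_ml_task_run"
   reference rather than from this section, so verify them there.

```python
import time
from typing import Any, Tuple

# Assumed terminal statuses for an ML task run (not listed in this section).
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT"}


def import_labels_and_wait(
    client: Any,
    transform_id: str,
    input_s3_path: str,
    replace_all: bool = False,
    poll_seconds: float = 30.0,
    max_polls: int = 120,
) -> Tuple[str, str]:
    """Start a label import and poll until it reaches a terminal status."""
    run_id = client.start_import_labels_task_run(
        TransformId=transform_id,
        InputS3Path=input_s3_path,
        ReplaceAllLabels=replace_all,
    )["TaskRunId"]
    for _ in range(max_polls):
        status = client.get_ml_task_run(
            TransformId=transform_id, TaskRunId=run_id
        )["Status"]
        if status in TERMINAL_STATUSES:
            return run_id, status
        time.sleep(poll_seconds)
    raise TimeoutError(f"task run {run_id} did not finish after {max_polls} polls")
```

   Leaving "replace_all" at its default keeps previously uploaded
   labels. Passing "True" corresponds to "ReplaceAllLabels=True" and
   discards them, as described above.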

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.ResourceNumberLimitExceededException"

   * "Glue.Client.exceptions.InternalServiceException"


cancel_data_quality_rule_recommendation_run
*******************************************

Glue.Client.cancel_data_quality_rule_recommendation_run(**kwargs)

   Cancels the specified recommendation run that was being used to
   generate rules.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.cancel_data_quality_rule_recommendation_run(
          RunId='string'
      )

   Parameters:
      **RunId** (*string*) --

      **[REQUIRED]**

      The unique run identifier associated with this run.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*
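
   Since a successful call returns an empty dict, the outcome is
   signalled only through exceptions. The sketch below, with the
   hypothetical helper name "cancel_recommendation_run", treats an
   "EntityNotFoundException" as "nothing to cancel" and lets every
   other error propagate. It assumes the exception class is reachable
   as "client.exceptions.EntityNotFoundException", matching the names
   listed under Exceptions.

```python
from typing import Any


def cancel_recommendation_run(client: Any, run_id: str) -> bool:
    """Cancel a recommendation run; return False if the run is unknown."""
    try:
        client.cancel_data_quality_rule_recommendation_run(RunId=run_id)
    except client.exceptions.EntityNotFoundException:
        # The run ID does not exist (or no longer exists).
        return False
    return True
```

   Whether a missing run should be silently tolerated depends on the
   caller. Re-raising instead of returning "False" is equally
   reasonable.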

   **Exceptions**

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"

   * "Glue.Client.exceptions.InternalServiceException"


get_schema_by_definition
************************

Glue.Client.get_schema_by_definition(**kwargs)

   Retrieves a schema by the "SchemaDefinition". The schema definition
   is sent to the Schema Registry, canonicalized, and hashed. If the
   hash is matched within the scope of the "SchemaName" or ARN (or the
   default registry, if none is supplied), that schema’s metadata is
   returned. Otherwise, a 404 or NotFound error is returned. Schema
   versions in "Deleted" statuses will not be included in the results.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.get_schema_by_definition(
          SchemaId={
              'SchemaArn': 'string',
              'SchemaName': 'string',
              'RegistryName': 'string'
          },
          SchemaDefinition='string'
      )

   Parameters:
      * **SchemaId** (*dict*) --

        **[REQUIRED]**

        This is a wrapper structure to contain schema identity fields.
        The structure contains:

        * SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the
          schema. One of "SchemaArn" or "SchemaName" has to be
          provided.

        * SchemaId$SchemaName: The name of the schema. One of
          "SchemaArn" or "SchemaName" has to be provided.

        * **SchemaArn** *(string) --*

          The Amazon Resource Name (ARN) of the schema. One of
          "SchemaArn" or "SchemaName" has to be provided.

        * **SchemaName** *(string) --*

          The name of the schema. One of "SchemaArn" or "SchemaName"
          has to be provided.

        * **RegistryName** *(string) --*

          The name of the schema registry that contains the schema.

      * **SchemaDefinition** (*string*) --

        **[REQUIRED]**

        The definition of the schema for which schema details are
        required.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {
             'SchemaVersionId': 'string',
             'SchemaArn': 'string',
             'DataFormat': 'AVRO'|'JSON'|'PROTOBUF',
             'Status': 'AVAILABLE'|'PENDING'|'FAILURE'|'DELETING',
             'CreatedTime': 'string'
         }

      **Response Structure**

      * *(dict) --*

        * **SchemaVersionId** *(string) --*

          The schema ID of the schema version.

        * **SchemaArn** *(string) --*

          The Amazon Resource Name (ARN) of the schema.

        * **DataFormat** *(string) --*

          The data format of the schema definition. Currently "AVRO",
          "JSON" and "PROTOBUF" are supported.

        * **Status** *(string) --*

          The status of the schema version.

        * **CreatedTime** *(string) --*

          The date and time the schema was created.
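
   A common use of this lookup is to check whether a definition is
   already registered before creating a new version. The sketch below
   is illustrative only: the helper name "find_schema_version" is
   hypothetical, it assumes the definition is held as a Python dict
   that can be serialized with "json.dumps", and it assumes the
   not-found case surfaces as "client.exceptions.EntityNotFoundException".

```python
import json
from typing import Any, Dict, Optional


def find_schema_version(
    client: Any,
    registry_name: str,
    schema_name: str,
    definition: Dict[str, Any],
) -> Optional[Dict[str, Any]]:
    """Return the matching schema version metadata, or None if not registered."""
    try:
        return client.get_schema_by_definition(
            SchemaId={"SchemaName": schema_name, "RegistryName": registry_name},
            # The API expects the definition as a string.
            SchemaDefinition=json.dumps(definition),
        )
    except client.exceptions.EntityNotFoundException:
        return None
```

   The returned dict carries "SchemaVersionId" and "Status", so a
   caller can additionally require "Status" to be "AVAILABLE" before
   relying on the version.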

   **Exceptions**

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InternalServiceException"


start_column_statistics_task_run_schedule
*****************************************

Glue.Client.start_column_statistics_task_run_schedule(**kwargs)

   Starts a column statistics task run schedule.

   See also: AWS API Documentation

   **Request Syntax**

      response = client.start_column_statistics_task_run_schedule(
          DatabaseName='string',
          TableName='string'
      )

   Parameters:
      * **DatabaseName** (*string*) --

        **[REQUIRED]**

        The name of the database where the table resides.

      * **TableName** (*string*) --

        **[REQUIRED]**

        The name of the table for which to start a column statistics
        task run schedule.

   Return type:
      dict

   Returns:
      **Response Syntax**

         {}

      **Response Structure**

      * *(dict) --*
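
   The call acts on one table at a time. The sketch below applies it to
   several tables in one database and records which ones were not
   found. The helper name "start_statistics_schedules" is hypothetical,
   and the code assumes a missing table or schedule surfaces as
   "client.exceptions.EntityNotFoundException".

```python
from typing import Any, Iterable, List, Tuple


def start_statistics_schedules(
    client: Any, database_name: str, table_names: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Start the schedule for each table; return (started, not_found)."""
    started: List[str] = []
    not_found: List[str] = []
    for table in table_names:
        try:
            client.start_column_statistics_task_run_schedule(
                DatabaseName=database_name, TableName=table
            )
        except client.exceptions.EntityNotFoundException:
            not_found.append(table)
        else:
            started.append(table)
    return started, not_found
```

   Other errors, such as "AccessDeniedException", are deliberately left
   to propagate so that a permissions problem stops the loop early.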

   **Exceptions**

   * "Glue.Client.exceptions.AccessDeniedException"

   * "Glue.Client.exceptions.EntityNotFoundException"

   * "Glue.Client.exceptions.InvalidInputException"

   * "Glue.Client.exceptions.OperationTimeoutException"