amazonka-glue-2.0: Amazon Glue SDK.
Copyright: (c) 2013-2023 Brendan Hay
License: Mozilla Public License, v. 2.0.
Maintainer: Brendan Hay
Stability: auto-generated
Portability: non-portable (GHC extensions)
Safe Haskell: Safe-Inferred
Language: Haskell2010

Amazonka.Glue.Lens

Operations

BatchCreatePartition

batchCreatePartition_catalogId :: Lens' BatchCreatePartition (Maybe Text) Source #

The ID of the catalog in which the partition is to be created. Currently, this should be the Amazon Web Services account ID.

batchCreatePartition_databaseName :: Lens' BatchCreatePartition Text Source #

The name of the metadata database in which the partition is to be created.

batchCreatePartition_tableName :: Lens' BatchCreatePartition Text Source #

The name of the metadata table in which the partition is to be created.

batchCreatePartition_partitionInputList :: Lens' BatchCreatePartition [PartitionInput] Source #

A list of PartitionInput structures that define the partitions to be created.

batchCreatePartitionResponse_errors :: Lens' BatchCreatePartitionResponse (Maybe [PartitionError]) Source #

The errors encountered when trying to create the requested partitions.
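
As a rough illustration of how these lenses are used, the sketch below builds a BatchCreatePartition request and reads a field back. It assumes the `lens` package for the `&`, `?~` and `^.` operators (any van Laarhoven-compatible lens library should behave the same way) and the `newBatchCreatePartition` smart constructor from Amazonka.Glue, which is assumed to take the database name and table name. The database, table and account ID values are placeholders.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Amazonka.Glue (BatchCreatePartition, newBatchCreatePartition)
import Amazonka.Glue.Lens
  ( batchCreatePartition_catalogId,
    batchCreatePartition_tableName
  )
import Control.Lens ((&), (?~), (^.))
import Data.Text (Text)

-- Build a request for a placeholder database and table, then set the
-- optional catalog ID. (?~) wraps the value in Just for a Maybe field.
exampleRequest :: BatchCreatePartition
exampleRequest =
  newBatchCreatePartition "sales_db" "orders"
    & batchCreatePartition_catalogId ?~ "123456789012"

-- Viewing a required field yields the value directly, with no Maybe.
exampleTable :: Text
exampleTable = exampleRequest ^. batchCreatePartition_tableName
```

Optional fields have the type `Maybe a`, so they are set with `?~` and read back as a `Maybe`; required fields such as the table name are plain values.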

BatchDeleteConnection

batchDeleteConnection_catalogId :: Lens' BatchDeleteConnection (Maybe Text) Source #

The ID of the Data Catalog in which the connections reside. If none is provided, the Amazon Web Services account ID is used by default.

batchDeleteConnection_connectionNameList :: Lens' BatchDeleteConnection [Text] Source #

A list of names of the connections to delete.

batchDeleteConnectionResponse_errors :: Lens' BatchDeleteConnectionResponse (Maybe (HashMap Text ErrorDetail)) Source #

A map of the names of connections that were not successfully deleted to error details.

batchDeleteConnectionResponse_succeeded :: Lens' BatchDeleteConnectionResponse (Maybe [Text]) Source #

A list of names of the connection definitions that were successfully deleted.
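
The two response lenses above can be combined to separate failures from successes. The following is a minimal sketch, assuming the `lens` and `unordered-containers` packages and the `newBatchDeleteConnectionResponse` smart constructor (assumed to take the HTTP status code). The helper names and the connection names are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Amazonka.Glue (BatchDeleteConnectionResponse, newBatchDeleteConnectionResponse)
import Amazonka.Glue.Lens
  ( batchDeleteConnectionResponse_errors,
    batchDeleteConnectionResponse_succeeded
  )
import Control.Lens ((&), (?~), (^.))
import qualified Data.HashMap.Strict as HashMap
import Data.Maybe (fromMaybe)
import Data.Text (Text)

-- Names of connections that failed to delete; an absent map means none failed.
failedConnections :: BatchDeleteConnectionResponse -> [Text]
failedConnections resp =
  maybe [] HashMap.keys (resp ^. batchDeleteConnectionResponse_errors)

-- Names of connections that were deleted; an absent list is treated as empty.
deletedConnections :: BatchDeleteConnectionResponse -> [Text]
deletedConnections resp =
  fromMaybe [] (resp ^. batchDeleteConnectionResponse_succeeded)

-- A hand-built response (HTTP status 200) standing in for a real API result.
sampleResponse :: BatchDeleteConnectionResponse
sampleResponse =
  newBatchDeleteConnectionResponse 200
    & batchDeleteConnectionResponse_succeeded ?~ ["conn-a", "conn-b"]
```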

BatchDeletePartition

batchDeletePartition_catalogId :: Lens' BatchDeletePartition (Maybe Text) Source #

The ID of the Data Catalog where the partition to be deleted resides. If none is provided, the Amazon Web Services account ID is used by default.

batchDeletePartition_databaseName :: Lens' BatchDeletePartition Text Source #

The name of the catalog database in which the table in question resides.

batchDeletePartition_tableName :: Lens' BatchDeletePartition Text Source #

The name of the table that contains the partitions to be deleted.

batchDeletePartition_partitionsToDelete :: Lens' BatchDeletePartition [PartitionValueList] Source #

A list of PartitionValueList structures that identify the partitions to be deleted.

batchDeletePartitionResponse_errors :: Lens' BatchDeletePartitionResponse (Maybe [PartitionError]) Source #

The errors encountered when trying to delete the requested partitions.

BatchDeleteTable

batchDeleteTable_catalogId :: Lens' BatchDeleteTable (Maybe Text) Source #

The ID of the Data Catalog where the table resides. If none is provided, the Amazon Web Services account ID is used by default.

batchDeleteTable_transactionId :: Lens' BatchDeleteTable (Maybe Text) Source #

The transaction ID at which to delete the table contents.

batchDeleteTable_databaseName :: Lens' BatchDeleteTable Text Source #

The name of the catalog database in which the tables to delete reside. For Hive compatibility, this name is entirely lowercase.

batchDeleteTableResponse_errors :: Lens' BatchDeleteTableResponse (Maybe [TableError]) Source #

A list of errors encountered in attempting to delete the specified tables.

BatchDeleteTableVersion

batchDeleteTableVersion_catalogId :: Lens' BatchDeleteTableVersion (Maybe Text) Source #

The ID of the Data Catalog where the tables reside. If none is provided, the Amazon Web Services account ID is used by default.

batchDeleteTableVersion_databaseName :: Lens' BatchDeleteTableVersion Text Source #

The database in the catalog in which the table resides. For Hive compatibility, this name is entirely lowercase.

batchDeleteTableVersion_tableName :: Lens' BatchDeleteTableVersion Text Source #

The name of the table. For Hive compatibility, this name is entirely lowercase.

batchDeleteTableVersion_versionIds :: Lens' BatchDeleteTableVersion [Text] Source #

A list of the IDs of versions to be deleted. A VersionId is a string representation of an integer. Each version is incremented by 1.
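
Since a VersionId is an integer rendered as a string, a list of IDs for this lens can be generated from a numeric range. This is only a sketch: `versionIdRange` is a hypothetical helper, and it assumes the versions you want to delete form a contiguous range.

```haskell
import Data.Text (Text)
import qualified Data.Text as Text

-- Render the inclusive range of integer versions as VersionId strings.
-- The result is empty when firstId is greater than lastId.
versionIdRange :: Int -> Int -> [Text]
versionIdRange firstId lastId = map (Text.pack . show) [firstId .. lastId]
```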

batchDeleteTableVersionResponse_errors :: Lens' BatchDeleteTableVersionResponse (Maybe [TableVersionError]) Source #

A list of errors encountered while trying to delete the specified table versions.

BatchGetBlueprints

batchGetBlueprints_includeBlueprint :: Lens' BatchGetBlueprints (Maybe Bool) Source #

Specifies whether or not to include the blueprint in the response.

batchGetBlueprints_includeParameterSpec :: Lens' BatchGetBlueprints (Maybe Bool) Source #

Specifies whether or not to include the parameters, as a JSON string, for the blueprint in the response.

batchGetBlueprintsResponse_blueprints :: Lens' BatchGetBlueprintsResponse (Maybe [Blueprint]) Source #

Returns a list of blueprints as a Blueprints object.

batchGetBlueprintsResponse_missingBlueprints :: Lens' BatchGetBlueprintsResponse (Maybe [Text]) Source #

Returns a list of BlueprintNames that were not found.

BatchGetCrawlers

batchGetCrawlers_crawlerNames :: Lens' BatchGetCrawlers [Text] Source #

A list of crawler names, which might be the names returned from the ListCrawlers operation.

batchGetCrawlersResponse_crawlersNotFound :: Lens' BatchGetCrawlersResponse (Maybe [Text]) Source #

A list of names of crawlers that were not found.

BatchGetCustomEntityTypes

batchGetCustomEntityTypes_names :: Lens' BatchGetCustomEntityTypes (NonEmpty Text) Source #

A list of names of the custom patterns that you want to retrieve.

batchGetCustomEntityTypesResponse_customEntityTypes :: Lens' BatchGetCustomEntityTypesResponse (Maybe [CustomEntityType]) Source #

A list of CustomEntityType objects representing the custom patterns that have been created.

BatchGetDataQualityResult

batchGetDataQualityResult_resultIds :: Lens' BatchGetDataQualityResult (NonEmpty Text) Source #

A list of unique result IDs for the data quality results.

batchGetDataQualityResultResponse_results :: Lens' BatchGetDataQualityResultResponse [DataQualityResult] Source #

A list of DataQualityResult objects representing the data quality results.

BatchGetDevEndpoints

batchGetDevEndpoints_devEndpointNames :: Lens' BatchGetDevEndpoints (NonEmpty Text) Source #

The list of DevEndpoint names, which might be the names returned from the ListDevEndpoints operation.

BatchGetJobs

batchGetJobs_jobNames :: Lens' BatchGetJobs [Text] Source #

A list of job names, which might be the names returned from the ListJobs operation.

BatchGetPartition

batchGetPartition_catalogId :: Lens' BatchGetPartition (Maybe Text) Source #

The ID of the Data Catalog where the partitions in question reside. If none is supplied, the Amazon Web Services account ID is used by default.

batchGetPartition_databaseName :: Lens' BatchGetPartition Text Source #

The name of the catalog database where the partitions reside.

batchGetPartition_tableName :: Lens' BatchGetPartition Text Source #

The name of the partitions' table.

batchGetPartition_partitionsToGet :: Lens' BatchGetPartition [PartitionValueList] Source #

A list of partition values identifying the partitions to retrieve.

batchGetPartitionResponse_unprocessedKeys :: Lens' BatchGetPartitionResponse (Maybe [PartitionValueList]) Source #

A list of the partition values in the request for which partitions were not returned.

BatchGetTriggers

batchGetTriggers_triggerNames :: Lens' BatchGetTriggers [Text] Source #

A list of trigger names, which may be the names returned from the ListTriggers operation.

BatchGetWorkflows

batchGetWorkflows_includeGraph :: Lens' BatchGetWorkflows (Maybe Bool) Source #

Specifies whether to include a graph when returning the workflow resource metadata.

batchGetWorkflows_names :: Lens' BatchGetWorkflows (NonEmpty Text) Source #

A list of workflow names, which may be the names returned from the ListWorkflows operation.

BatchStopJobRun

batchStopJobRun_jobName :: Lens' BatchStopJobRun Text Source #

The name of the job definition for which to stop job runs.

batchStopJobRun_jobRunIds :: Lens' BatchStopJobRun (NonEmpty Text) Source #

A list of the JobRunIds that should be stopped for that job definition.

batchStopJobRunResponse_errors :: Lens' BatchStopJobRunResponse (Maybe [BatchStopJobRunError]) Source #

A list of the errors that were encountered in trying to stop JobRuns, including the JobRunId for which each error was encountered and details about the error.

batchStopJobRunResponse_successfulSubmissions :: Lens' BatchStopJobRunResponse (Maybe [BatchStopJobRunSuccessfulSubmission]) Source #

A list of the JobRuns that were successfully submitted for stopping.

BatchUpdatePartition

batchUpdatePartition_catalogId :: Lens' BatchUpdatePartition (Maybe Text) Source #

The ID of the catalog in which the partition is to be updated. Currently, this should be the Amazon Web Services account ID.

batchUpdatePartition_databaseName :: Lens' BatchUpdatePartition Text Source #

The name of the metadata database in which the partition is to be updated.

batchUpdatePartition_tableName :: Lens' BatchUpdatePartition Text Source #

The name of the metadata table in which the partition is to be updated.

batchUpdatePartition_entries :: Lens' BatchUpdatePartition (NonEmpty BatchUpdatePartitionRequestEntry) Source #

A list of up to 100 BatchUpdatePartitionRequestEntry objects to update.
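
Because a single request accepts at most 100 entries, a longer list has to be split into several requests. The sketch below uses only `base`; `chunksOf` and `batchesOf100` are hypothetical helpers, and each batch is a `NonEmpty` to match the type of the lens above.

```haskell
import Data.List.NonEmpty (NonEmpty, nonEmpty)
import Data.Maybe (mapMaybe)

-- Split a list into consecutive chunks of at most n elements.
chunksOf :: Int -> [a] -> [[a]]
chunksOf n xs
  | n <= 0 = []
  | null xs = []
  | otherwise =
      let (chunk, rest) = splitAt n xs
       in chunk : chunksOf n rest

-- Group entries into non-empty batches of at most 100, one per request.
batchesOf100 :: [a] -> [NonEmpty a]
batchesOf100 = mapMaybe nonEmpty . chunksOf 100
```

Each resulting batch can then be supplied as the entries of one BatchUpdatePartition request.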

batchUpdatePartitionResponse_errors :: Lens' BatchUpdatePartitionResponse (Maybe [BatchUpdatePartitionFailureEntry]) Source #

The errors encountered when trying to update the requested partitions. A list of BatchUpdatePartitionFailureEntry objects.

CancelDataQualityRuleRecommendationRun

CancelDataQualityRulesetEvaluationRun

CancelMLTaskRun

cancelMLTaskRun_transformId :: Lens' CancelMLTaskRun Text Source #

The unique identifier of the machine learning transform.

cancelMLTaskRun_taskRunId :: Lens' CancelMLTaskRun Text Source #

A unique identifier for the task run.

cancelMLTaskRunResponse_transformId :: Lens' CancelMLTaskRunResponse (Maybe Text) Source #

The unique identifier of the machine learning transform.

CancelStatement

cancelStatement_requestOrigin :: Lens' CancelStatement (Maybe Text) Source #

The origin of the request to cancel the statement.

cancelStatement_sessionId :: Lens' CancelStatement Text Source #

The Session ID of the statement to be cancelled.

cancelStatement_id :: Lens' CancelStatement Int Source #

The ID of the statement to be cancelled.

CheckSchemaVersionValidity

checkSchemaVersionValidity_dataFormat :: Lens' CheckSchemaVersionValidity DataFormat Source #

The data format of the schema definition. Currently AVRO, JSON and PROTOBUF are supported.

checkSchemaVersionValidity_schemaDefinition :: Lens' CheckSchemaVersionValidity Text Source #

The definition of the schema that has to be validated.

checkSchemaVersionValidityResponse_valid :: Lens' CheckSchemaVersionValidityResponse (Maybe Bool) Source #

Returns true if the schema is valid and false otherwise.

CreateBlueprint

createBlueprint_description :: Lens' CreateBlueprint (Maybe Text) Source #

A description of the blueprint.

createBlueprint_tags :: Lens' CreateBlueprint (Maybe (HashMap Text Text)) Source #

The tags to be applied to this blueprint.

createBlueprint_name :: Lens' CreateBlueprint Text Source #

The name of the blueprint.

createBlueprint_blueprintLocation :: Lens' CreateBlueprint Text Source #

Specifies a path in Amazon S3 where the blueprint is published.

createBlueprintResponse_name :: Lens' CreateBlueprintResponse (Maybe Text) Source #

Returns the name of the blueprint that was registered.

CreateClassifier

createClassifier_csvClassifier :: Lens' CreateClassifier (Maybe CreateCsvClassifierRequest) Source #

A CsvClassifier object specifying the classifier to create.

createClassifier_grokClassifier :: Lens' CreateClassifier (Maybe CreateGrokClassifierRequest) Source #

A GrokClassifier object specifying the classifier to create.

createClassifier_jsonClassifier :: Lens' CreateClassifier (Maybe CreateJsonClassifierRequest) Source #

A JsonClassifier object specifying the classifier to create.

createClassifier_xMLClassifier :: Lens' CreateClassifier (Maybe CreateXMLClassifierRequest) Source #

An XMLClassifier object specifying the classifier to create.

CreateConnection

createConnection_catalogId :: Lens' CreateConnection (Maybe Text) Source #

The ID of the Data Catalog in which to create the connection. If none is provided, the Amazon Web Services account ID is used by default.

createConnection_tags :: Lens' CreateConnection (Maybe (HashMap Text Text)) Source #

The tags you assign to the connection.

createConnection_connectionInput :: Lens' CreateConnection ConnectionInput Source #

A ConnectionInput object defining the connection to create.

CreateCrawler

createCrawler_classifiers :: Lens' CreateCrawler (Maybe [Text]) Source #

A list of custom classifiers that the user has registered. By default, all built-in classifiers are included in a crawl, but these custom classifiers always override the default classifiers for a given classification.

createCrawler_configuration :: Lens' CreateCrawler (Maybe Text) Source #

Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler's behavior. For more information, see Setting crawler configuration options.

createCrawler_crawlerSecurityConfiguration :: Lens' CreateCrawler (Maybe Text) Source #

The name of the SecurityConfiguration structure to be used by this crawler.

createCrawler_databaseName :: Lens' CreateCrawler (Maybe Text) Source #

The Glue database where results are written, such as: arn:aws:daylight:us-east-1::database/sometable/*.

createCrawler_description :: Lens' CreateCrawler (Maybe Text) Source #

A description of the new crawler.

createCrawler_lakeFormationConfiguration :: Lens' CreateCrawler (Maybe LakeFormationConfiguration) Source #

Specifies Lake Formation configuration settings for the crawler.

createCrawler_lineageConfiguration :: Lens' CreateCrawler (Maybe LineageConfiguration) Source #

Specifies data lineage configuration settings for the crawler.

createCrawler_recrawlPolicy :: Lens' CreateCrawler (Maybe RecrawlPolicy) Source #

A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.

createCrawler_schedule :: Lens' CreateCrawler (Maybe Text) Source #

A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).
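
The daily form of this expression can be generated from an hour and a minute. The helper below is a hypothetical sketch that covers only the "every day at a fixed UTC time" case shown in the example, not the full cron syntax.

```haskell
-- Build a daily cron schedule for the given UTC hour and minute,
-- rejecting hours outside 0-23 and minutes outside 0-59.
dailyCron :: Int -> Int -> Maybe String
dailyCron hour minute
  | hour < 0 || hour > 23 = Nothing
  | minute < 0 || minute > 59 = Nothing
  | otherwise =
      Just ("cron(" ++ show minute ++ " " ++ show hour ++ " * * ? *)")
```

For instance, `dailyCron 12 15` yields `Just "cron(15 12 * * ? *)"`, which matches the example above once converted to `Text`.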

createCrawler_schemaChangePolicy :: Lens' CreateCrawler (Maybe SchemaChangePolicy) Source #

The policy for the crawler's update and deletion behavior.

createCrawler_tablePrefix :: Lens' CreateCrawler (Maybe Text) Source #

The table prefix used for catalog tables that are created.

createCrawler_tags :: Lens' CreateCrawler (Maybe (HashMap Text Text)) Source #

The tags to use with this crawler request. You may use tags to limit access to the crawler. For more information about tags in Glue, see Amazon Web Services Tags in Glue in the developer guide.

createCrawler_name :: Lens' CreateCrawler Text Source #

Name of the new crawler.

createCrawler_role :: Lens' CreateCrawler Text Source #

The IAM role or Amazon Resource Name (ARN) of an IAM role used by the new crawler to access customer resources.

createCrawler_targets :: Lens' CreateCrawler CrawlerTargets Source #

A collection of targets to crawl.

CreateCustomEntityType

createCustomEntityType_contextWords :: Lens' CreateCustomEntityType (Maybe (NonEmpty Text)) Source #

A list of context words. If none of these context words are found within the vicinity of the regular expression, the data will not be detected as sensitive data.

If no context words are passed, only a regular expression is checked.

createCustomEntityType_name :: Lens' CreateCustomEntityType Text Source #

A name for the custom pattern that allows it to be retrieved or deleted later. This name must be unique per Amazon Web Services account.

createCustomEntityType_regexString :: Lens' CreateCustomEntityType Text Source #

A regular expression string that is used for detecting sensitive data in a custom pattern.

CreateDataQualityRuleset

createDataQualityRuleset_clientToken :: Lens' CreateDataQualityRuleset (Maybe Text) Source #

Used for idempotency and is recommended to be set to a random ID (such as a UUID) to avoid creating or starting multiple instances of the same resource.

createDataQualityRuleset_tags :: Lens' CreateDataQualityRuleset (Maybe (HashMap Text Text)) Source #

A list of tags applied to the data quality ruleset.

createDataQualityRuleset_targetTable :: Lens' CreateDataQualityRuleset (Maybe DataQualityTargetTable) Source #

A target table associated with the data quality ruleset.

createDataQualityRuleset_name :: Lens' CreateDataQualityRuleset Text Source #

A unique name for the data quality ruleset.

createDataQualityRuleset_ruleset :: Lens' CreateDataQualityRuleset Text Source #

A Data Quality Definition Language (DQDL) ruleset. For more information, see the Glue developer guide.

CreateDatabase

createDatabase_catalogId :: Lens' CreateDatabase (Maybe Text) Source #

The ID of the Data Catalog in which to create the database. If none is provided, the Amazon Web Services account ID is used by default.

createDatabase_tags :: Lens' CreateDatabase (Maybe (HashMap Text Text)) Source #

The tags you assign to the database.

CreateDevEndpoint

createDevEndpoint_arguments :: Lens' CreateDevEndpoint (Maybe (HashMap Text Text)) Source #

A map of arguments used to configure the DevEndpoint.

createDevEndpoint_extraJarsS3Path :: Lens' CreateDevEndpoint (Maybe Text) Source #

The path to one or more Java .jar files in an S3 bucket that should be loaded in your DevEndpoint.

createDevEndpoint_extraPythonLibsS3Path :: Lens' CreateDevEndpoint (Maybe Text) Source #

The paths to one or more Python libraries in an Amazon S3 bucket that should be loaded in your DevEndpoint. Multiple values must be complete paths separated by a comma.

You can only use pure Python libraries with a DevEndpoint. Libraries that rely on C extensions, such as the pandas Python data analysis library, are not yet supported.
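
The comma-separated form can be produced from a list of complete paths. This is a small sketch using only `base`; `pythonLibsPath` is a hypothetical helper and the bucket name in the usage note is a placeholder.

```haskell
import Data.List (intercalate)

-- Join complete S3 paths into a single comma-separated value.
pythonLibsPath :: [String] -> String
pythonLibsPath = intercalate ","
```

For example, `pythonLibsPath ["s3://example-bucket/libs/a.zip", "s3://example-bucket/libs/b.zip"]` gives `"s3://example-bucket/libs/a.zip,s3://example-bucket/libs/b.zip"`, which would need converting to `Text` before being set through the lens.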

createDevEndpoint_glueVersion :: Lens' CreateDevEndpoint (Maybe Text) Source #

Glue version determines the versions of Apache Spark and Python that Glue supports. The Python version indicates the version supported for running your ETL scripts on development endpoints.

For more information about the available Glue versions and corresponding Spark and Python versions, see Glue version in the developer guide.

Development endpoints that are created without specifying a Glue version default to Glue 0.9.

You can specify a version of Python support for development endpoints by using the Arguments parameter in the CreateDevEndpoint or UpdateDevEndpoint APIs. If no arguments are provided, the version defaults to Python 2.

createDevEndpoint_numberOfNodes :: Lens' CreateDevEndpoint (Maybe Int) Source #

The number of Glue Data Processing Units (DPUs) to allocate to this DevEndpoint.

createDevEndpoint_numberOfWorkers :: Lens' CreateDevEndpoint (Maybe Int) Source #

The number of workers of a defined workerType that are allocated to the development endpoint.

The maximum number of workers you can define is 299 for G.1X and 149 for G.2X.
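
These limits could be checked before a request is built. The sketch below uses a local stand-in type rather than the library's WorkerType, and the helper names are hypothetical.

```haskell
-- Local stand-in for the two worker types with a stated limit.
data LimitedWorker = G1X | G2X
  deriving (Eq, Show)

-- Maximum worker count per type, as stated above.
maxWorkers :: LimitedWorker -> Int
maxWorkers G1X = 299
maxWorkers G2X = 149

-- True when the requested count is positive and within the limit.
withinLimit :: LimitedWorker -> Int -> Bool
withinLimit worker n = n > 0 && n <= maxWorkers worker
```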

createDevEndpoint_publicKey :: Lens' CreateDevEndpoint (Maybe Text) Source #

The public key to be used by this DevEndpoint for authentication. This attribute is provided for backward compatibility because the recommended attribute to use is public keys.

createDevEndpoint_publicKeys :: Lens' CreateDevEndpoint (Maybe [Text]) Source #

A list of public keys to be used by the development endpoints for authentication. The use of this attribute is preferred over a single public key because the public keys allow you to have a different private key per client.

If you previously created an endpoint with a public key, you must remove that key to be able to set a list of public keys. Call the UpdateDevEndpoint API with the public key content in the deletePublicKeys attribute, and the list of new keys in the addPublicKeys attribute.

createDevEndpoint_securityConfiguration :: Lens' CreateDevEndpoint (Maybe Text) Source #

The name of the SecurityConfiguration structure to be used with this DevEndpoint.

createDevEndpoint_securityGroupIds :: Lens' CreateDevEndpoint (Maybe [Text]) Source #

Security group IDs for the security groups to be used by the new DevEndpoint.

createDevEndpoint_subnetId :: Lens' CreateDevEndpoint (Maybe Text) Source #

The subnet ID for the new DevEndpoint to use.

createDevEndpoint_tags :: Lens' CreateDevEndpoint (Maybe (HashMap Text Text)) Source #

The tags to use with this DevEndpoint. You may use tags to limit access to the DevEndpoint. For more information about tags in Glue, see Amazon Web Services Tags in Glue in the developer guide.

createDevEndpoint_workerType :: Lens' CreateDevEndpoint (Maybe WorkerType) Source #

The type of predefined worker that is allocated to the development endpoint. Accepts a value of Standard, G.1X, or G.2X.

  • For the Standard worker type, each worker provides 4 vCPU, 16 GB of memory and a 50 GB disk, and 2 executors per worker.
  • For the G.1X worker type, each worker maps to 1 DPU (4 vCPU, 16 GB of memory, 64 GB disk), and provides 1 executor per worker. We recommend this worker type for memory-intensive jobs.
  • For the G.2X worker type, each worker maps to 2 DPU (8 vCPU, 32 GB of memory, 128 GB disk), and provides 1 executor per worker. We recommend this worker type for memory-intensive jobs.

Known issue: when a development endpoint is created with the G.2X WorkerType configuration, the Spark drivers for the development endpoint will run on 4 vCPU, 16 GB of memory, and a 64 GB disk.

createDevEndpoint_endpointName :: Lens' CreateDevEndpoint Text Source #

The name to be assigned to the new DevEndpoint.

createDevEndpoint_roleArn :: Lens' CreateDevEndpoint Text Source #

The IAM role for the DevEndpoint.

createDevEndpointResponse_arguments :: Lens' CreateDevEndpointResponse (Maybe (HashMap Text Text)) Source #

The map of arguments used to configure this DevEndpoint.

Valid arguments are:

  • "--enable-glue-datacatalog": ""

You can specify a version of Python support for development endpoints by using the Arguments parameter in the CreateDevEndpoint or UpdateDevEndpoint APIs. If no arguments are provided, the version defaults to Python 2.

createDevEndpointResponse_availabilityZone :: Lens' CreateDevEndpointResponse (Maybe Text) Source #

The Amazon Web Services Availability Zone where this DevEndpoint is located.

createDevEndpointResponse_createdTimestamp :: Lens' CreateDevEndpointResponse (Maybe UTCTime) Source #

The point in time at which this DevEndpoint was created.

createDevEndpointResponse_extraJarsS3Path :: Lens' CreateDevEndpointResponse (Maybe Text) Source #

Path to one or more Java .jar files in an S3 bucket that will be loaded in your DevEndpoint.

createDevEndpointResponse_extraPythonLibsS3Path :: Lens' CreateDevEndpointResponse (Maybe Text) Source #

The paths to one or more Python libraries in an S3 bucket that will be loaded in your DevEndpoint.

createDevEndpointResponse_failureReason :: Lens' CreateDevEndpointResponse (Maybe Text) Source #

The reason for a current failure in this DevEndpoint.

createDevEndpointResponse_glueVersion :: Lens' CreateDevEndpointResponse (Maybe Text) Source #

Glue version determines the versions of Apache Spark and Python that Glue supports. The Python version indicates the version supported for running your ETL scripts on development endpoints.

For more information about the available Glue versions and corresponding Spark and Python versions, see Glue version in the developer guide.

createDevEndpointResponse_numberOfNodes :: Lens' CreateDevEndpointResponse (Maybe Int) Source #

The number of Glue Data Processing Units (DPUs) allocated to this DevEndpoint.

createDevEndpointResponse_numberOfWorkers :: Lens' CreateDevEndpointResponse (Maybe Int) Source #

The number of workers of a defined workerType that are allocated to the development endpoint.

createDevEndpointResponse_roleArn :: Lens' CreateDevEndpointResponse (Maybe Text) Source #

The Amazon Resource Name (ARN) of the role assigned to the new DevEndpoint.

createDevEndpointResponse_securityConfiguration :: Lens' CreateDevEndpointResponse (Maybe Text) Source #

The name of the SecurityConfiguration structure being used with this DevEndpoint.

createDevEndpointResponse_securityGroupIds :: Lens' CreateDevEndpointResponse (Maybe [Text]) Source #

The security groups assigned to the new DevEndpoint.

createDevEndpointResponse_status :: Lens' CreateDevEndpointResponse (Maybe Text) Source #

The current status of the new DevEndpoint.

createDevEndpointResponse_subnetId :: Lens' CreateDevEndpointResponse (Maybe Text) Source #

The subnet ID assigned to the new DevEndpoint.

createDevEndpointResponse_vpcId :: Lens' CreateDevEndpointResponse (Maybe Text) Source #

The ID of the virtual private cloud (VPC) used by this DevEndpoint.

createDevEndpointResponse_workerType :: Lens' CreateDevEndpointResponse (Maybe WorkerType) Source #

The type of predefined worker that is allocated to the development endpoint. May be a value of Standard, G.1X, or G.2X.

createDevEndpointResponse_yarnEndpointAddress :: Lens' CreateDevEndpointResponse (Maybe Text) Source #

The address of the YARN endpoint used by this DevEndpoint.

createDevEndpointResponse_zeppelinRemoteSparkInterpreterPort :: Lens' CreateDevEndpointResponse (Maybe Int) Source #

The Apache Zeppelin port for the remote Apache Spark interpreter.

CreateJob

createJob_allocatedCapacity :: Lens' CreateJob (Maybe Int) Source #

This parameter is deprecated. Use MaxCapacity instead.

The number of Glue data processing units (DPUs) to allocate to this Job. You can allocate a minimum of 2 DPUs; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page.

createJob_codeGenConfigurationNodes :: Lens' CreateJob (Maybe (HashMap Text CodeGenConfigurationNode)) Source #

The representation of a directed acyclic graph on which both the Glue Studio visual component and Glue Studio code generation are based.

createJob_connections :: Lens' CreateJob (Maybe ConnectionsList) Source #

The connections used for this job.

createJob_defaultArguments :: Lens' CreateJob (Maybe (HashMap Text Text)) Source #

The default arguments for this job.

You can specify arguments here that your own job-execution script consumes, as well as arguments that Glue itself consumes.

Job arguments may be logged. Do not pass plaintext secrets as arguments. Retrieve secrets from a Glue Connection, Secrets Manager or other secret management mechanism if you intend to keep them within the Job.

For information about how to specify and consume your own Job arguments, see the Calling Glue APIs in Python topic in the developer guide.

For information about the key-value pairs that Glue consumes to set up your job, see the Special Parameters Used by Glue topic in the developer guide.

createJob_description :: Lens' CreateJob (Maybe Text) Source #

Description of the job being defined.

createJob_executionClass :: Lens' CreateJob (Maybe ExecutionClass) Source #

Indicates whether the job is run with a standard or flexible execution class. The standard execution-class is ideal for time-sensitive workloads that require fast job startup and dedicated resources.

The flexible execution class is appropriate for time-insensitive jobs whose start and completion times may vary.

Only jobs with Glue version 3.0 and above and command type glueetl will be allowed to set ExecutionClass to FLEX. The flexible execution class is available for Spark jobs.

createJob_executionProperty :: Lens' CreateJob (Maybe ExecutionProperty) Source #

An ExecutionProperty specifying the maximum number of concurrent runs allowed for this job.

createJob_glueVersion :: Lens' CreateJob (Maybe Text) Source #

Glue version determines the versions of Apache Spark and Python that Glue supports. The Python version indicates the version supported for jobs of type Spark.

For more information about the available Glue versions and corresponding Spark and Python versions, see Glue version in the developer guide.

Jobs that are created without specifying a Glue version default to Glue 0.9.

createJob_logUri :: Lens' CreateJob (Maybe Text) Source #

This field is reserved for future use.

createJob_maxCapacity :: Lens' CreateJob (Maybe Double) Source #

For Glue version 1.0 or earlier jobs, using the standard worker type, the number of Glue data processing units (DPUs) that can be allocated when this job runs. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page.

Do not set MaxCapacity if using WorkerType and NumberOfWorkers.

The value that can be allocated for MaxCapacity depends on whether you are running a Python shell job or an Apache Spark ETL job:

  • When you specify a Python shell job (JobCommand.Name="pythonshell"), you can allocate either 0.0625 or 1 DPU. The default is 0.0625 DPU.
  • When you specify an Apache Spark ETL job (JobCommand.Name="glueetl") or Apache Spark streaming ETL job (JobCommand.Name="gluestreaming"), you can allocate a minimum of 2 DPUs. The default is 10 DPUs. This job type cannot have a fractional DPU allocation.

For Glue version 2.0 jobs, you cannot specify a Maximum capacity. Instead, you should specify a Worker type and the Number of workers.
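
The MaxCapacity rules for the three job commands can be written down as a small check. This is a sketch under stated assumptions: `JobKind` is a local stand-in for the JobCommand.Name values, the helper names are hypothetical, and the Glue-version restriction is not modelled.

```haskell
-- Local stand-in for the JobCommand.Name values named above.
data JobKind = PythonShell | GlueEtl | GlueStreaming
  deriving (Eq, Show)

-- Default MaxCapacity, in DPUs, for each job kind.
defaultMaxCapacity :: JobKind -> Double
defaultMaxCapacity PythonShell = 0.0625
defaultMaxCapacity _ = 10

-- Check a requested MaxCapacity against the rules above: Python shell
-- jobs take 0.0625 or 1 DPU; Spark jobs take a whole number of at least 2.
validMaxCapacity :: JobKind -> Double -> Bool
validMaxCapacity PythonShell dpu = dpu == 0.0625 || dpu == 1
validMaxCapacity _ dpu =
  dpu >= 2 && dpu == fromIntegral (round dpu :: Integer)
```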

createJob_maxRetries :: Lens' CreateJob (Maybe Int) Source #

The maximum number of times to retry this job if it fails.

createJob_nonOverridableArguments :: Lens' CreateJob (Maybe (HashMap Text Text)) Source #

Non-overridable arguments for this job, specified as name-value pairs.

createJob_notificationProperty :: Lens' CreateJob (Maybe NotificationProperty) Source #

Specifies configuration properties of a job notification.

createJob_numberOfWorkers :: Lens' CreateJob (Maybe Int) Source #

The number of workers of a defined workerType that are allocated when a job runs.

createJob_securityConfiguration :: Lens' CreateJob (Maybe Text) Source #

The name of the SecurityConfiguration structure to be used with this job.

createJob_sourceControlDetails :: Lens' CreateJob (Maybe SourceControlDetails) Source #

The details for a source control configuration for a job, allowing synchronization of job artifacts to or from a remote repository.

createJob_tags :: Lens' CreateJob (Maybe (HashMap Text Text)) Source #

The tags to use with this job. You may use tags to limit access to the job. For more information about tags in Glue, see Amazon Web Services Tags in Glue in the developer guide.

createJob_timeout :: Lens' CreateJob (Maybe Natural) Source #

The job timeout in minutes. This is the maximum time that a job run can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).

createJob_workerType :: Lens' CreateJob (Maybe WorkerType) Source #

The type of predefined worker that is allocated when a job runs. Accepts a value of Standard, G.1X, G.2X, or G.025X.

  • For the Standard worker type, each worker provides 4 vCPU, 16 GB of memory and a 50 GB disk, and 2 executors per worker.
  • For the G.1X worker type, each worker maps to 1 DPU (4 vCPU, 16 GB of memory, 64 GB disk), and provides 1 executor per worker. We recommend this worker type for memory-intensive jobs.
  • For the G.2X worker type, each worker maps to 2 DPU (8 vCPU, 32 GB of memory, 128 GB disk), and provides 1 executor per worker. We recommend this worker type for memory-intensive jobs.
  • For the G.025X worker type, each worker maps to 0.25 DPU (2 vCPU, 4 GB of memory, 64 GB disk), and provides 1 executor per worker. We recommend this worker type for low volume streaming jobs. This worker type is only available for Glue version 3.0 streaming jobs.
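
The worker-type list above implies a simple capacity calculation. The following is a minimal sketch, assuming only base; the Worker type and both function names are hypothetical stand-ins, not part of the SDK. The text gives no DPU mapping for the Standard worker type, so the sketch returns Nothing for it rather than guessing.

```haskell
-- Stand-in for the SDK's WorkerType; these constructor names are hypothetical.
data Worker = Standard | G1X | G2X | G025X
  deriving (Eq, Show)

-- DPUs per worker, as listed in the text. No DPU mapping is documented for
-- Standard, so Nothing is returned for it.
dpuPerWorker :: Worker -> Maybe Double
dpuPerWorker Standard = Nothing
dpuPerWorker G1X      = Just 1
dpuPerWorker G2X      = Just 2
dpuPerWorker G025X    = Just 0.25

-- Total DPUs implied by a worker type and a number of workers.
totalDpu :: Worker -> Int -> Maybe Double
totalDpu w n = (* fromIntegral n) <$> dpuPerWorker w

main :: IO ()
main = do
  print (totalDpu G2X 10)     -- Just 20.0
  print (totalDpu G025X 4)    -- Just 1.0
  print (totalDpu Standard 4) -- Nothing
```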

createJob_name :: Lens' CreateJob Text Source #

The name you assign to this job definition. It must be unique in your account.

createJob_role :: Lens' CreateJob Text Source #

The name or Amazon Resource Name (ARN) of the IAM role associated with this job.

createJob_command :: Lens' CreateJob JobCommand Source #

The JobCommand that runs this job.

createJobResponse_name :: Lens' CreateJobResponse (Maybe Text) Source #

The unique name that was provided for this job definition.

createJobResponse_httpStatus :: Lens' CreateJobResponse Int Source #

The response's HTTP status code.
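
Every entry on this page has the shape Lens' record field, with optional fields wrapped in Maybe. The sketch below illustrates what that shape means, assuming only base: it defines the van Laarhoven encoding locally and uses a hypothetical JobRequest record with hypothetical lenses (jobName, jobMaxRetries) rather than the SDK's CreateJob. The real lenses should compose the same way with the operators of a van Laarhoven lens library such as lens or microlens, but that pairing is an assumption here, not something this page states.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- The van Laarhoven encoding of a simple lens.
type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s

-- Hypothetical stand-in for a request record; not the SDK's CreateJob.
data JobRequest = JobRequest
  { _name       :: String
  , _maxRetries :: Maybe Int
  } deriving (Eq, Show)

-- A lens onto a required field.
jobName :: Lens' JobRequest String
jobName f r = (\x -> r {_name = x}) <$> f (_name r)

-- A lens onto an optional field, hence the Maybe.
jobMaxRetries :: Lens' JobRequest (Maybe Int)
jobMaxRetries f r = (\x -> r {_maxRetries = x}) <$> f (_maxRetries r)

-- Read the focused field.
view :: Lens' s a -> s -> a
view l = getConst . l Const

-- Overwrite the focused field.
set :: Lens' s a -> a -> s -> s
set l x = runIdentity . l (const (Identity x))

main :: IO ()
main = do
  let req = set jobMaxRetries (Just 3) (JobRequest "nightly-etl" Nothing)
  print (view jobName req)       -- "nightly-etl"
  print (view jobMaxRetries req) -- Just 3
```

Optional fields are set with Just and cleared with Nothing; required fields take the bare value.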

CreateMLTransform

createMLTransform_description :: Lens' CreateMLTransform (Maybe Text) Source #

A description of the machine learning transform that is being defined. The default is an empty string.

createMLTransform_glueVersion :: Lens' CreateMLTransform (Maybe Text) Source #

This value determines which version of Glue this machine learning transform is compatible with. Glue 1.0 is recommended for most customers. If the value is not set, the Glue compatibility defaults to Glue 0.9. For more information, see Glue Versions in the developer guide.

createMLTransform_maxCapacity :: Lens' CreateMLTransform (Maybe Double) Source #

The number of Glue data processing units (DPUs) that are allocated to task runs for this transform. You can allocate from 2 to 100 DPUs; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page.

MaxCapacity is a mutually exclusive option with NumberOfWorkers and WorkerType.

  • If either NumberOfWorkers or WorkerType is set, then MaxCapacity cannot be set.
  • If MaxCapacity is set, then neither NumberOfWorkers nor WorkerType can be set.
  • If WorkerType is set, then NumberOfWorkers is required (and vice versa).
  • MaxCapacity and NumberOfWorkers must both be at least 1.

When the WorkerType field is set to a value other than Standard, the MaxCapacity field is set automatically and becomes read-only.
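
The four rules listed above can be checked before a request is sent. This is a minimal client-side sketch, assuming only base; the Capacity record and validateCapacity are hypothetical names, the worker type is held as a plain String, and the service remains the authority on what it accepts.

```haskell
-- Hypothetical stand-in for the three capacity-related request fields.
data Capacity = Capacity
  { maxCapacity     :: Maybe Double
  , numberOfWorkers :: Maybe Int
  , workerType      :: Maybe String
  }

-- Returns Left with the first violated rule, or Right () if none is violated.
validateCapacity :: Capacity -> Either String ()
validateCapacity (Capacity mc nw wt) =
  case (mc, nw, wt) of
    -- MaxCapacity is mutually exclusive with the other two fields.
    (Just _, Just _, _)  -> Left "MaxCapacity cannot be combined with NumberOfWorkers"
    (Just _, _, Just _)  -> Left "MaxCapacity cannot be combined with WorkerType"
    -- NumberOfWorkers and WorkerType must be set together.
    (_, Just _, Nothing) -> Left "NumberOfWorkers requires WorkerType"
    (_, Nothing, Just _) -> Left "WorkerType requires NumberOfWorkers"
    -- Both numeric fields must be at least 1.
    (Just c, _, _) | c < 1 -> Left "MaxCapacity must be at least 1"
    (_, Just n, _) | n < 1 -> Left "NumberOfWorkers must be at least 1"
    _ -> Right ()

main :: IO ()
main = do
  print (validateCapacity (Capacity (Just 10) Nothing Nothing))
  print (validateCapacity (Capacity Nothing (Just 5) (Just "G.1X")))
  print (validateCapacity (Capacity (Just 10) (Just 5) (Just "G.1X")))
```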

createMLTransform_maxRetries :: Lens' CreateMLTransform (Maybe Int) Source #

The maximum number of times to retry a task for this transform after a task run fails.

createMLTransform_numberOfWorkers :: Lens' CreateMLTransform (Maybe Int) Source #

The number of workers of a defined workerType that are allocated when this task runs.

If WorkerType is set, then NumberOfWorkers is required (and vice versa).

createMLTransform_tags :: Lens' CreateMLTransform (Maybe (HashMap Text Text)) Source #

The tags to use with this machine learning transform. You may use tags to limit access to the machine learning transform. For more information about tags in Glue, see Amazon Web Services Tags in Glue in the developer guide.

createMLTransform_timeout :: Lens' CreateMLTransform (Maybe Natural) Source #

The timeout of the task run for this transform in minutes. This is the maximum time that a task run for this transform can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).

createMLTransform_transformEncryption :: Lens' CreateMLTransform (Maybe TransformEncryption) Source #

The encryption-at-rest settings of the transform that apply to accessing user data. Machine learning transforms can access user data encrypted in Amazon S3 using KMS.

createMLTransform_workerType :: Lens' CreateMLTransform (Maybe WorkerType) Source #

The type of predefined worker that is allocated when this task runs. Accepts a value of Standard, G.1X, or G.2X.

  • For the Standard worker type, each worker provides 4 vCPU, 16 GB of memory and a 50GB disk, and 2 executors per worker.
  • For the G.1X worker type, each worker provides 4 vCPU, 16 GB of memory and a 64GB disk, and 1 executor per worker.
  • For the G.2X worker type, each worker provides 8 vCPU, 32 GB of memory and a 128GB disk, and 1 executor per worker.

MaxCapacity is a mutually exclusive option with NumberOfWorkers and WorkerType.

  • If either NumberOfWorkers or WorkerType is set, then MaxCapacity cannot be set.
  • If MaxCapacity is set, then neither NumberOfWorkers nor WorkerType can be set.
  • If WorkerType is set, then NumberOfWorkers is required (and vice versa).
  • MaxCapacity and NumberOfWorkers must both be at least 1.

createMLTransform_name :: Lens' CreateMLTransform Text Source #

The unique name that you give the transform when you create it.

createMLTransform_inputRecordTables :: Lens' CreateMLTransform [GlueTable] Source #

A list of Glue table definitions used by the transform.

createMLTransform_parameters :: Lens' CreateMLTransform TransformParameters Source #

The algorithmic parameters that are specific to the transform type used. Conditionally dependent on the transform type.

createMLTransform_role :: Lens' CreateMLTransform Text Source #

The name or Amazon Resource Name (ARN) of the IAM role with the required permissions. The required permissions include both Glue service role permissions to Glue resources, and Amazon S3 permissions required by the transform.

  • This role needs Glue service role permissions to allow access to resources in Glue. See Attach a Policy to IAM Users That Access Glue.
  • This role needs permission to your Amazon Simple Storage Service (Amazon S3) sources, targets, temporary directory, scripts, and any libraries used by the task run for this transform.

createMLTransformResponse_transformId :: Lens' CreateMLTransformResponse (Maybe Text) Source #

A unique identifier that is generated for the transform.

CreatePartition

createPartition_catalogId :: Lens' CreatePartition (Maybe Text) Source #

The Amazon Web Services account ID of the catalog in which the partition is to be created.

createPartition_databaseName :: Lens' CreatePartition Text Source #

The name of the metadata database in which the partition is to be created.

createPartition_tableName :: Lens' CreatePartition Text Source #

The name of the metadata table in which the partition is to be created.

createPartition_partitionInput :: Lens' CreatePartition PartitionInput Source #

A PartitionInput structure defining the partition to be created.

CreatePartitionIndex

createPartitionIndex_catalogId :: Lens' CreatePartitionIndex (Maybe Text) Source #

The catalog ID where the table resides.

createPartitionIndex_databaseName :: Lens' CreatePartitionIndex Text Source #

Specifies the name of a database in which you want to create a partition index.

createPartitionIndex_tableName :: Lens' CreatePartitionIndex Text Source #

Specifies the name of a table in which you want to create a partition index.

createPartitionIndex_partitionIndex :: Lens' CreatePartitionIndex PartitionIndex Source #

Specifies a PartitionIndex structure to create a partition index in an existing table.

CreateRegistry

createRegistry_description :: Lens' CreateRegistry (Maybe Text) Source #

A description of the registry. If no description is provided, no default value is used.

createRegistry_tags :: Lens' CreateRegistry (Maybe (HashMap Text Text)) Source #

Amazon Web Services tags that contain a key value pair and may be searched by console, command line, or API.

createRegistry_registryName :: Lens' CreateRegistry Text Source #

The name of the registry to be created. The name has a maximum length of 255 and may contain only letters, numbers, hyphens, underscores, dollar signs, or hash marks. Whitespace is not allowed.
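
The naming constraint above can be expressed as a small predicate. This is a minimal sketch, assuming only base; isValidName is a hypothetical helper, and two choices in it are assumptions not stated in the text: "letters" and "numbers" are read as ASCII only, and an empty name is rejected.

```haskell
import Data.Char (isAlphaNum, isAscii)

-- True when the name satisfies the documented constraints:
-- at most 255 characters, drawn only from letters, numbers, '-', '_', '$', '#'.
-- Assumptions: ASCII letters and digits only, and the name must be non-empty.
isValidName :: String -> Bool
isValidName s = not (null s) && length s <= 255 && all allowed s
  where
    allowed c = (isAscii c && isAlphaNum c) || c `elem` "-_$#"

main :: IO ()
main = do
  print (isValidName "my-registry_1")     -- True
  print (isValidName "has space")         -- False
  print (isValidName (replicate 256 'a')) -- False
```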

createRegistryResponse_registryArn :: Lens' CreateRegistryResponse (Maybe Text) Source #

The Amazon Resource Name (ARN) of the newly created registry.

CreateSchema

createSchema_compatibility :: Lens' CreateSchema (Maybe Compatibility) Source #

The compatibility mode of the schema. The possible values are:

  • NONE: No compatibility mode applies. You can use this choice in development scenarios or if you do not know the compatibility mode that you want to apply to schemas. Any new version added will be accepted without undergoing a compatibility check.
  • DISABLED: This compatibility choice prevents versioning for a particular schema. You can use this choice to prevent future versioning of a schema.
  • BACKWARD: This compatibility choice is recommended as it allows data receivers to read both the current and one previous schema version. This means that for instance, a new schema version cannot drop data fields or change the type of these fields, so they can't be read by readers using the previous version.
  • BACKWARD_ALL: This compatibility choice allows data receivers to read both the current and all previous schema versions. You can use this choice when you need to delete fields or add optional fields, and check compatibility against all previous schema versions.
  • FORWARD: This compatibility choice allows data receivers to read both the current and one next schema version, but not necessarily later versions. You can use this choice when you need to add fields or delete optional fields, but only check compatibility against the last schema version.
  • FORWARD_ALL: This compatibility choice allows data receivers to read data written by producers of any new registered schema. You can use this choice when you need to add fields or delete optional fields, and check compatibility against all previous schema versions.
  • FULL: This compatibility choice allows data receivers to read data written by producers using the previous or next version of the schema, but not necessarily earlier or later versions. You can use this choice when you need to add or remove optional fields, but only check compatibility against the last schema version.
  • FULL_ALL: This compatibility choice allows data receivers to read data written by producers using all previous schema versions. You can use this choice when you need to add or remove optional fields, and check compatibility against all previous schema versions.
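
One way to summarize the list above is by which earlier versions each mode is checked against. The sketch below assumes only base; Mode, Scope and checkScope are hypothetical names, not the SDK's Compatibility type. The BACKWARD entry is an inference from "the current and one previous schema version", since the text does not state its check scope explicitly.

```haskell
-- Hypothetical stand-in for the compatibility modes listed in the text.
data Mode
  = None | Disabled | Backward | BackwardAll
  | Forward | ForwardAll | Full | FullAll
  deriving (Eq, Show, Enum, Bounded)

-- Which earlier versions a new schema version is checked against.
data Scope = NoCheck | VersioningDisabled | LastVersion | AllPreviousVersions
  deriving (Eq, Show)

checkScope :: Mode -> Scope
checkScope None        = NoCheck             -- accepted without a check
checkScope Disabled    = VersioningDisabled  -- no new versions at all
checkScope Backward    = LastVersion         -- inferred from the text
checkScope BackwardAll = AllPreviousVersions
checkScope Forward     = LastVersion
checkScope ForwardAll  = AllPreviousVersions
checkScope Full        = LastVersion
checkScope FullAll     = AllPreviousVersions

main :: IO ()
main = mapM_ (\m -> putStrLn (show m ++ ": " ++ show (checkScope m))) [minBound .. maxBound]
```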

createSchema_description :: Lens' CreateSchema (Maybe Text) Source #

An optional description of the schema. If no description is provided, no default value is generated.

createSchema_registryId :: Lens' CreateSchema (Maybe RegistryId) Source #

This is a wrapper shape to contain the registry identity fields. If this is not provided, the default registry will be used. The ARN of the default registry has the format: arn:aws:glue:us-east-2:<customer id>:registry/default-registry:random-5-letter-id.

createSchema_schemaDefinition :: Lens' CreateSchema (Maybe Text) Source #

The schema definition using the DataFormat setting for SchemaName.

createSchema_tags :: Lens' CreateSchema (Maybe (HashMap Text Text)) Source #

Amazon Web Services tags that contain a key value pair and may be searched by console, command line, or API. If specified, follows the Amazon Web Services tags-on-create pattern.

createSchema_schemaName :: Lens' CreateSchema Text Source #

The name of the schema to be created. The name has a maximum length of 255 and may contain only letters, numbers, hyphens, underscores, dollar signs, or hash marks. Whitespace is not allowed.

createSchema_dataFormat :: Lens' CreateSchema DataFormat Source #

The data format of the schema definition. Currently AVRO, JSON and PROTOBUF are supported.

createSchemaResponse_dataFormat :: Lens' CreateSchemaResponse (Maybe DataFormat) Source #

The data format of the schema definition. Currently AVRO, JSON and PROTOBUF are supported.

createSchemaResponse_description :: Lens' CreateSchemaResponse (Maybe Text) Source #

A description of the schema if specified when created.

createSchemaResponse_latestSchemaVersion :: Lens' CreateSchemaResponse (Maybe Natural) Source #

The latest version of the schema associated with the returned schema definition.

createSchemaResponse_nextSchemaVersion :: Lens' CreateSchemaResponse (Maybe Natural) Source #

The next version of the schema associated with the returned schema definition.

createSchemaResponse_registryArn :: Lens' CreateSchemaResponse (Maybe Text) Source #

The Amazon Resource Name (ARN) of the registry.

createSchemaResponse_schemaArn :: Lens' CreateSchemaResponse (Maybe Text) Source #

The Amazon Resource Name (ARN) of the schema.

createSchemaResponse_schemaCheckpoint :: Lens' CreateSchemaResponse (Maybe Natural) Source #

The version number of the checkpoint (the last time the compatibility mode was changed).

createSchemaResponse_schemaVersionId :: Lens' CreateSchemaResponse (Maybe Text) Source #

The unique identifier of the first schema version.

CreateScript

createScript_dagEdges :: Lens' CreateScript (Maybe [CodeGenEdge]) Source #

A list of the edges in the DAG.

createScript_dagNodes :: Lens' CreateScript (Maybe [CodeGenNode]) Source #

A list of the nodes in the DAG.

createScript_language :: Lens' CreateScript (Maybe Language) Source #

The programming language of the resulting code from the DAG.

createScriptResponse_pythonScript :: Lens' CreateScriptResponse (Maybe Text) Source #

The Python script generated from the DAG.

createScriptResponse_scalaCode :: Lens' CreateScriptResponse (Maybe Text) Source #

The Scala code generated from the DAG.

CreateSecurityConfiguration

createSecurityConfiguration_name :: Lens' CreateSecurityConfiguration Text Source #

The name for the new security configuration.

CreateSession

createSession_connections :: Lens' CreateSession (Maybe ConnectionsList) Source #

The number of connections to use for the session.

createSession_defaultArguments :: Lens' CreateSession (Maybe (HashMap Text Text)) Source #

A map of key-value pairs. The maximum is 75 pairs.

createSession_description :: Lens' CreateSession (Maybe Text) Source #

The description of the session.

createSession_glueVersion :: Lens' CreateSession (Maybe Text) Source #

The Glue version determines the versions of Apache Spark and Python that Glue supports. The GlueVersion must be greater than 2.0.

createSession_idleTimeout :: Lens' CreateSession (Maybe Natural) Source #

The number of seconds of idle time before the request times out.

createSession_maxCapacity :: Lens' CreateSession (Maybe Double) Source #

The number of Glue data processing units (DPUs) that can be allocated when the job runs. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB memory.

createSession_numberOfWorkers :: Lens' CreateSession (Maybe Int) Source #

The number of workers of a defined WorkerType to use for the session.

createSession_securityConfiguration :: Lens' CreateSession (Maybe Text) Source #

The name of the SecurityConfiguration structure to be used with the session.

createSession_tags :: Lens' CreateSession (Maybe (HashMap Text Text)) Source #

The map of key value pairs (tags) belonging to the session.

createSession_timeout :: Lens' CreateSession (Maybe Natural) Source #

The number of seconds before the request times out.

createSession_workerType :: Lens' CreateSession (Maybe WorkerType) Source #

The type of predefined worker that is allocated to use for the session. Accepts a value of Standard, G.1X, G.2X, or G.025X.

  • For the Standard worker type, each worker provides 4 vCPU, 16 GB of memory and a 50GB disk, and 2 executors per worker.
  • For the G.1X worker type, each worker maps to 1 DPU (4 vCPU, 16 GB of memory, 64 GB disk), and provides 1 executor per worker. We recommend this worker type for memory-intensive jobs.
  • For the G.2X worker type, each worker maps to 2 DPU (8 vCPU, 32 GB of memory, 128 GB disk), and provides 1 executor per worker. We recommend this worker type for memory-intensive jobs.
  • For the G.025X worker type, each worker maps to 0.25 DPU (2 vCPU, 4 GB of memory, 64 GB disk), and provides 1 executor per worker. We recommend this worker type for low volume streaming jobs. This worker type is only available for Glue version 3.0 streaming jobs.

createSession_id :: Lens' CreateSession Text Source #

The ID of the session request.

createSession_command :: Lens' CreateSession SessionCommand Source #

The SessionCommand that runs the job.

createSessionResponse_session :: Lens' CreateSessionResponse (Maybe Session) Source #

Returns the session object in the response.

CreateTable

createTable_catalogId :: Lens' CreateTable (Maybe Text) Source #

The ID of the Data Catalog in which to create the Table. If none is supplied, the Amazon Web Services account ID is used by default.

createTable_partitionIndexes :: Lens' CreateTable (Maybe [PartitionIndex]) Source #

A list of partition indexes, PartitionIndex structures, to create in the table.

createTable_databaseName :: Lens' CreateTable Text Source #

The catalog database in which to create the new table. For Hive compatibility, this name is entirely lowercase.

createTable_tableInput :: Lens' CreateTable TableInput Source #

The TableInput object that defines the metadata table to create in the catalog.

CreateTrigger

createTrigger_description :: Lens' CreateTrigger (Maybe Text) Source #

A description of the new trigger.

createTrigger_eventBatchingCondition :: Lens' CreateTrigger (Maybe EventBatchingCondition) Source #

The batch condition that must be met (specified number of events received or batch time window expired) before the EventBridge event trigger fires.

createTrigger_predicate :: Lens' CreateTrigger (Maybe Predicate) Source #

A predicate to specify when the new trigger should fire.

This field is required when the trigger type is CONDITIONAL.

createTrigger_schedule :: Lens' CreateTrigger (Maybe Text) Source #

A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).

This field is required when the trigger type is SCHEDULED.
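
The daily example above can be generalized to any time of day. This is a minimal sketch, assuming only base; dailyAt is a hypothetical helper, and the field order (minute first, then hour) is inferred from the single example in the text rather than from a full grammar.

```haskell
-- Builds a daily schedule expression in the same shape as the example in the
-- text. Returns Nothing when the hour or minute is out of range.
dailyAt :: Int -> Int -> Maybe String
dailyAt hour minute
  | hour < 0 || hour > 23     = Nothing
  | minute < 0 || minute > 59 = Nothing
  | otherwise =
      Just ("cron(" ++ show minute ++ " " ++ show hour ++ " * * ? *)")

main :: IO ()
main = do
  print (dailyAt 12 15) -- Just "cron(15 12 * * ? *)"
  print (dailyAt 24 0)  -- Nothing
```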

createTrigger_startOnCreation :: Lens' CreateTrigger (Maybe Bool) Source #

Set to true to start SCHEDULED and CONDITIONAL triggers when created. True is not supported for ON_DEMAND triggers.
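
The field requirements stated for predicate, schedule and startOnCreation depend on the trigger type, so they can be collected into one check. The sketch below assumes only base; TriggerKind, TriggerSpec and validateTrigger are hypothetical stand-ins, and only the three trigger types named in the text are modeled.

```haskell
-- Hypothetical stand-in covering only the trigger types named in the text.
data TriggerKind = Scheduled | Conditional | OnDemand
  deriving (Eq, Show)

-- Hypothetical summary of the request fields that the rules mention.
data TriggerSpec = TriggerSpec
  { triggerKind     :: TriggerKind
  , schedule        :: Maybe String
  , hasPredicate    :: Bool
  , startOnCreation :: Maybe Bool
  }

-- Returns every violated rule; an empty list means no rule is violated.
validateTrigger :: TriggerSpec -> [String]
validateTrigger t = concat
  [ [ "schedule is required for SCHEDULED triggers"
    | triggerKind t == Scheduled, schedule t == Nothing ]
  , [ "predicate is required for CONDITIONAL triggers"
    | triggerKind t == Conditional, not (hasPredicate t) ]
  , [ "startOnCreation = true is not supported for ON_DEMAND triggers"
    | triggerKind t == OnDemand, startOnCreation t == Just True ]
  ]

main :: IO ()
main = do
  print (validateTrigger (TriggerSpec Scheduled (Just "cron(15 12 * * ? *)") False (Just True)))
  print (validateTrigger (TriggerSpec OnDemand Nothing False (Just True)))
```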

createTrigger_tags :: Lens' CreateTrigger (Maybe (HashMap Text Text)) Source #

The tags to use with this trigger. You may use tags to limit access to the trigger. For more information about tags in Glue, see Amazon Web Services Tags in Glue in the developer guide.

createTrigger_workflowName :: Lens' CreateTrigger (Maybe Text) Source #

The name of the workflow associated with the trigger.

createTrigger_name :: Lens' CreateTrigger Text Source #

The name of the trigger.

createTrigger_type :: Lens' CreateTrigger TriggerType Source #

The type of the new trigger.

createTrigger_actions :: Lens' CreateTrigger [Action] Source #

The actions initiated by this trigger when it fires.

CreateUserDefinedFunction

createUserDefinedFunction_catalogId :: Lens' CreateUserDefinedFunction (Maybe Text) Source #

The ID of the Data Catalog in which to create the function. If none is provided, the Amazon Web Services account ID is used by default.

createUserDefinedFunction_databaseName :: Lens' CreateUserDefinedFunction Text Source #

The name of the catalog database in which to create the function.

createUserDefinedFunction_functionInput :: Lens' CreateUserDefinedFunction UserDefinedFunctionInput Source #

A FunctionInput object that defines the function to create in the Data Catalog.

CreateWorkflow

createWorkflow_defaultRunProperties :: Lens' CreateWorkflow (Maybe (HashMap Text Text)) Source #

A collection of properties to be used as part of each execution of the workflow.

createWorkflow_description :: Lens' CreateWorkflow (Maybe Text) Source #

A description of the workflow.

createWorkflow_maxConcurrentRuns :: Lens' CreateWorkflow (Maybe Int) Source #

You can use this parameter to prevent unwanted multiple updates to data, to control costs, or in some cases, to prevent exceeding the maximum number of concurrent runs of any of the component jobs. If you leave this parameter blank, there is no limit to the number of concurrent workflow runs.

createWorkflow_tags :: Lens' CreateWorkflow (Maybe (HashMap Text Text)) Source #

The tags to be used with this workflow.

createWorkflow_name :: Lens' CreateWorkflow Text Source #

The name to be assigned to the workflow. It should be unique within your account.

createWorkflowResponse_name :: Lens' CreateWorkflowResponse (Maybe Text) Source #

The name of the workflow which was provided as part of the request.

DeleteBlueprint

deleteBlueprint_name :: Lens' DeleteBlueprint Text Source #

The name of the blueprint to delete.

deleteBlueprintResponse_name :: Lens' DeleteBlueprintResponse (Maybe Text) Source #

Returns the name of the blueprint that was deleted.

DeleteClassifier

deleteClassifier_name :: Lens' DeleteClassifier Text Source #

Name of the classifier to remove.

DeleteColumnStatisticsForPartition

deleteColumnStatisticsForPartition_catalogId :: Lens' DeleteColumnStatisticsForPartition (Maybe Text) Source #

The ID of the Data Catalog where the partitions in question reside. If none is supplied, the Amazon Web Services account ID is used by default.

deleteColumnStatisticsForPartition_databaseName :: Lens' DeleteColumnStatisticsForPartition Text Source #

The name of the catalog database where the partitions reside.

DeleteColumnStatisticsForTable

deleteColumnStatisticsForTable_catalogId :: Lens' DeleteColumnStatisticsForTable (Maybe Text) Source #

The ID of the Data Catalog where the partitions in question reside. If none is supplied, the Amazon Web Services account ID is used by default.

deleteColumnStatisticsForTable_databaseName :: Lens' DeleteColumnStatisticsForTable Text Source #

The name of the catalog database where the partitions reside.

DeleteConnection

deleteConnection_catalogId :: Lens' DeleteConnection (Maybe Text) Source #

The ID of the Data Catalog in which the connection resides. If none is provided, the Amazon Web Services account ID is used by default.

deleteConnection_connectionName :: Lens' DeleteConnection Text Source #

The name of the connection to delete.

DeleteCrawler

deleteCrawler_name :: Lens' DeleteCrawler Text Source #

The name of the crawler to remove.

DeleteCustomEntityType

deleteCustomEntityType_name :: Lens' DeleteCustomEntityType Text Source #

The name of the custom pattern that you want to delete.

DeleteDataQualityRuleset

DeleteDatabase

deleteDatabase_catalogId :: Lens' DeleteDatabase (Maybe Text) Source #

The ID of the Data Catalog in which the database resides. If none is provided, the Amazon Web Services account ID is used by default.

deleteDatabase_name :: Lens' DeleteDatabase Text Source #

The name of the database to delete. For Hive compatibility, this must be all lowercase.

DeleteDevEndpoint

DeleteJob

deleteJob_jobName :: Lens' DeleteJob Text Source #

The name of the job definition to delete.

deleteJobResponse_jobName :: Lens' DeleteJobResponse (Maybe Text) Source #

The name of the job definition that was deleted.

deleteJobResponse_httpStatus :: Lens' DeleteJobResponse Int Source #

The response's HTTP status code.

DeleteMLTransform

deleteMLTransform_transformId :: Lens' DeleteMLTransform Text Source #

The unique identifier of the transform to delete.

deleteMLTransformResponse_transformId :: Lens' DeleteMLTransformResponse (Maybe Text) Source #

The unique identifier of the transform that was deleted.

DeletePartition

deletePartition_catalogId :: Lens' DeletePartition (Maybe Text) Source #

The ID of the Data Catalog where the partition to be deleted resides. If none is provided, the Amazon Web Services account ID is used by default.

deletePartition_databaseName :: Lens' DeletePartition Text Source #

The name of the catalog database in which the table in question resides.

deletePartition_tableName :: Lens' DeletePartition Text Source #

The name of the table that contains the partition to be deleted.

deletePartition_partitionValues :: Lens' DeletePartition [Text] Source #

The values that define the partition.

DeletePartitionIndex

deletePartitionIndex_catalogId :: Lens' DeletePartitionIndex (Maybe Text) Source #

The catalog ID where the table resides.

deletePartitionIndex_databaseName :: Lens' DeletePartitionIndex Text Source #

Specifies the name of a database from which you want to delete a partition index.

deletePartitionIndex_tableName :: Lens' DeletePartitionIndex Text Source #

Specifies the name of a table from which you want to delete a partition index.

deletePartitionIndex_indexName :: Lens' DeletePartitionIndex Text Source #

The name of the partition index to be deleted.

DeleteRegistry

deleteRegistry_registryId :: Lens' DeleteRegistry RegistryId Source #

This is a wrapper structure that may contain the registry name and Amazon Resource Name (ARN).

deleteRegistryResponse_registryArn :: Lens' DeleteRegistryResponse (Maybe Text) Source #

The Amazon Resource Name (ARN) of the registry being deleted.

deleteRegistryResponse_status :: Lens' DeleteRegistryResponse (Maybe RegistryStatus) Source #

The status of the registry. A successful operation will return the Deleting status.

DeleteResourcePolicy

deleteResourcePolicy_policyHashCondition :: Lens' DeleteResourcePolicy (Maybe Text) Source #

The hash value returned when this policy was set.

deleteResourcePolicy_resourceArn :: Lens' DeleteResourcePolicy (Maybe Text) Source #

The ARN of the Glue resource for the resource policy to be deleted.

DeleteSchema

deleteSchema_schemaId :: Lens' DeleteSchema SchemaId Source #

This is a wrapper structure that may contain the schema name and Amazon Resource Name (ARN).

deleteSchemaResponse_schemaArn :: Lens' DeleteSchemaResponse (Maybe Text) Source #

The Amazon Resource Name (ARN) of the schema being deleted.

deleteSchemaResponse_schemaName :: Lens' DeleteSchemaResponse (Maybe Text) Source #

The name of the schema being deleted.

DeleteSchemaVersions

deleteSchemaVersions_schemaId :: Lens' DeleteSchemaVersions SchemaId Source #

This is a wrapper structure that may contain the schema name and Amazon Resource Name (ARN).

deleteSchemaVersions_versions :: Lens' DeleteSchemaVersions Text Source #

A version range may be supplied which may be of the format:

  • a single version number, 5
  • a range, 5-8 : deletes versions 5, 6, 7, 8
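
The two accepted formats can be illustrated by expanding a version string into the versions it covers. This is a minimal sketch, assuming only base; parseVersions is a hypothetical helper, and rejecting a descending range such as 8-5 is an assumption, since the text does not say how the service treats it.

```haskell
import Data.Char (isDigit)

-- Expands "5" to [5] and "5-8" to [5,6,7,8]. Anything else yields Nothing.
-- Assumption: a descending range is treated as invalid.
parseVersions :: String -> Maybe [Int]
parseVersions s =
  case span isDigit s of
    -- A single version number.
    (lo@(_ : _), "") -> Just [read lo]
    -- An inclusive range "lo-hi".
    (lo@(_ : _), '-' : hi)
      | not (null hi), all isDigit hi, read lo <= (read hi :: Int) ->
          Just [read lo .. read hi]
    _ -> Nothing

main :: IO ()
main = do
  print (parseVersions "5")   -- Just [5]
  print (parseVersions "5-8") -- Just [5,6,7,8]
  print (parseVersions "8-5") -- Nothing
```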

deleteSchemaVersionsResponse_schemaVersionErrors :: Lens' DeleteSchemaVersionsResponse (Maybe [SchemaVersionErrorItem]) Source #

A list of SchemaVersionErrorItem objects, each containing an error and schema version.

DeleteSecurityConfiguration

deleteSecurityConfiguration_name :: Lens' DeleteSecurityConfiguration Text Source #

The name of the security configuration to delete.

DeleteSession

deleteSession_requestOrigin :: Lens' DeleteSession (Maybe Text) Source #

The name of the origin of the delete session request.

deleteSession_id :: Lens' DeleteSession Text Source #

The ID of the session to be deleted.

deleteSessionResponse_id :: Lens' DeleteSessionResponse (Maybe Text) Source #

Returns the ID of the deleted session.

DeleteTable

deleteTable_catalogId :: Lens' DeleteTable (Maybe Text) Source #

The ID of the Data Catalog where the table resides. If none is provided, the Amazon Web Services account ID is used by default.

deleteTable_transactionId :: Lens' DeleteTable (Maybe Text) Source #

The transaction ID at which to delete the table contents.

deleteTable_databaseName :: Lens' DeleteTable Text Source #

The name of the catalog database in which the table resides. For Hive compatibility, this name is entirely lowercase.

deleteTable_name :: Lens' DeleteTable Text Source #

The name of the table to be deleted. For Hive compatibility, this name is entirely lowercase.

DeleteTableVersion

deleteTableVersion_catalogId :: Lens' DeleteTableVersion (Maybe Text) Source #

The ID of the Data Catalog where the tables reside. If none is provided, the Amazon Web Services account ID is used by default.

deleteTableVersion_databaseName :: Lens' DeleteTableVersion Text Source #

The database in the catalog in which the table resides. For Hive compatibility, this name is entirely lowercase.

deleteTableVersion_tableName :: Lens' DeleteTableVersion Text Source #

The name of the table. For Hive compatibility, this name is entirely lowercase.

deleteTableVersion_versionId :: Lens' DeleteTableVersion Text Source #

The ID of the table version to be deleted. A VersionID is a string representation of an integer. Each version is incremented by 1.

DeleteTrigger

deleteTrigger_name :: Lens' DeleteTrigger Text Source #

The name of the trigger to delete.

deleteTriggerResponse_name :: Lens' DeleteTriggerResponse (Maybe Text) Source #

The name of the trigger that was deleted.

DeleteUserDefinedFunction

deleteUserDefinedFunction_catalogId :: Lens' DeleteUserDefinedFunction (Maybe Text) Source #

The ID of the Data Catalog where the function to be deleted is located. If none is supplied, the Amazon Web Services account ID is used by default.

deleteUserDefinedFunction_databaseName :: Lens' DeleteUserDefinedFunction Text Source #

The name of the catalog database where the function is located.

deleteUserDefinedFunction_functionName :: Lens' DeleteUserDefinedFunction Text Source #

The name of the function definition to be deleted.

DeleteWorkflow

deleteWorkflow_name :: Lens' DeleteWorkflow Text Source #

Name of the workflow to be deleted.

deleteWorkflowResponse_name :: Lens' DeleteWorkflowResponse (Maybe Text) Source #

Name of the workflow specified in input.

GetBlueprint

getBlueprint_includeBlueprint :: Lens' GetBlueprint (Maybe Bool) Source #

Specifies whether or not to include the blueprint in the response.

getBlueprint_includeParameterSpec :: Lens' GetBlueprint (Maybe Bool) Source #

Specifies whether or not to include the parameter specification.

getBlueprint_name :: Lens' GetBlueprint Text Source #

The name of the blueprint.

GetBlueprintRun

getBlueprintRun_runId :: Lens' GetBlueprintRun Text Source #

The run ID for the blueprint run you want to retrieve.

GetBlueprintRuns

getBlueprintRuns_maxResults :: Lens' GetBlueprintRuns (Maybe Natural) Source #

The maximum size of a list to return.

getBlueprintRuns_nextToken :: Lens' GetBlueprintRuns (Maybe Text) Source #

A continuation token, if this is a continuation request.

getBlueprintRunsResponse_nextToken :: Lens' GetBlueprintRunsResponse (Maybe Text) Source #

A continuation token, if not all blueprint runs have been returned.
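
The request and response nextToken fields are meant to be used together: each response token is passed into the next request until no token comes back. The sketch below shows that loop, assuming only base; Page, fetchAll and fakeFetch are hypothetical stand-ins for a real request function and are not part of the SDK.

```haskell
import Data.Functor.Identity (Identity (..))

-- Hypothetical stand-in for one page of results plus its continuation token.
data Page a = Page
  { pageItems     :: [a]
  , pageNextToken :: Maybe String
  }

-- Calls the fetch function repeatedly, feeding each response token into the
-- next request, and stops when a response carries no token.
fetchAll :: Monad m => (Maybe String -> m (Page a)) -> m [a]
fetchAll fetch = go Nothing
  where
    go token = do
      page <- fetch token
      case pageNextToken page of
        Nothing -> pure (pageItems page)
        next    -> (pageItems page ++) <$> go next

-- A fake two-page source used only to exercise the loop.
fakeFetch :: Maybe String -> Identity (Page Int)
fakeFetch Nothing     = pure (Page [1, 2] (Just "t1"))
fakeFetch (Just "t1") = pure (Page [3] Nothing)
fakeFetch _           = pure (Page [] Nothing)

main :: IO ()
main = print (runIdentity (fetchAll fakeFetch)) -- [1,2,3]
```

Keeping fetchAll polymorphic in the monad lets the same loop run against a pure fake in tests and an effectful client elsewhere.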

GetCatalogImportStatus

getCatalogImportStatus_catalogId :: Lens' GetCatalogImportStatus (Maybe Text) Source #

The ID of the catalog to migrate. Currently, this should be the Amazon Web Services account ID.

GetClassifier

getClassifier_name :: Lens' GetClassifier Text Source #

Name of the classifier to retrieve.

GetClassifiers

getClassifiers_maxResults :: Lens' GetClassifiers (Maybe Natural) Source #

The size of the list to return (optional).

getClassifiers_nextToken :: Lens' GetClassifiers (Maybe Text) Source #

An optional continuation token.

GetColumnStatisticsForPartition

getColumnStatisticsForPartition_catalogId :: Lens' GetColumnStatisticsForPartition (Maybe Text) Source #

The ID of the Data Catalog where the partitions in question reside. If none is supplied, the Amazon Web Services account ID is used by default.

getColumnStatisticsForPartition_databaseName :: Lens' GetColumnStatisticsForPartition Text Source #

The name of the catalog database where the partitions reside.

getColumnStatisticsForPartition_partitionValues :: Lens' GetColumnStatisticsForPartition [Text] Source #

A list of partition values identifying the partition.

GetColumnStatisticsForTable

getColumnStatisticsForTable_catalogId :: Lens' GetColumnStatisticsForTable (Maybe Text) Source #

The ID of the Data Catalog where the partitions in question reside. If none is supplied, the Amazon Web Services account ID is used by default.

getColumnStatisticsForTable_databaseName :: Lens' GetColumnStatisticsForTable Text Source #

The name of the catalog database where the partitions reside.

GetConnection

getConnection_catalogId :: Lens' GetConnection (Maybe Text) Source #

The ID of the Data Catalog in which the connection resides. If none is provided, the Amazon Web Services account ID is used by default.

getConnection_hidePassword :: Lens' GetConnection (Maybe Bool) Source #

Allows you to retrieve the connection metadata without returning the password. For instance, the Glue console uses this flag to retrieve the connection, and does not display the password. Set this parameter when the caller might not have permission to use the KMS key to decrypt the password, but it does have permission to access the rest of the connection properties.

getConnection_name :: Lens' GetConnection Text Source #

The name of the connection definition to retrieve.
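
As a minimal sketch of how the GetConnection lenses combine, the snippet below builds a request that asks Glue to omit the password. It assumes the `lens` package for the `&`, `?~` and `^.` operators and the generated smart constructor `newGetConnection` from Amazonka.Glue; the connection name "orders-jdbc" and the helper `hidesPassword` are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Amazonka.Glue (GetConnection, newGetConnection)
import Amazonka.Glue.Lens (getConnection_hidePassword)
import Control.Lens ((&), (?~), (^.))

-- A request for a hypothetical connection named "orders-jdbc" that
-- asks Glue to leave the password out of the response.
connectionWithoutPassword :: GetConnection
connectionWithoutPassword =
  newGetConnection "orders-jdbc"
    & getConnection_hidePassword ?~ True

-- True when the request explicitly sets hidePassword to True.
hidesPassword :: GetConnection -> Bool
hidesPassword req = req ^. getConnection_hidePassword == Just True
```

Because the field is a `Maybe Bool`, a freshly constructed request leaves it unset, which is why `?~` is used here rather than `.~`.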

GetConnections

getConnections_catalogId :: Lens' GetConnections (Maybe Text) Source #

The ID of the Data Catalog in which the connections reside. If none is provided, the Amazon Web Services account ID is used by default.

getConnections_filter :: Lens' GetConnections (Maybe GetConnectionsFilter) Source #

A filter that controls which connections are returned.

getConnections_hidePassword :: Lens' GetConnections (Maybe Bool) Source #

Allows you to retrieve the connection metadata without returning the password. For instance, the Glue console uses this flag to retrieve the connection, and does not display the password. Set this parameter when the caller might not have permission to use the KMS key to decrypt the password, but it does have permission to access the rest of the connection properties.

getConnections_maxResults :: Lens' GetConnections (Maybe Natural) Source #

The maximum number of connections to return in one response.

getConnections_nextToken :: Lens' GetConnections (Maybe Text) Source #

A continuation token, if this is a continuation call.

getConnectionsResponse_nextToken :: Lens' GetConnectionsResponse (Maybe Text) Source #

A continuation token, if the list of connections returned does not include the last of the filtered connections.

GetCrawler

getCrawler_name :: Lens' GetCrawler Text Source #

The name of the crawler to retrieve metadata for.

getCrawlerResponse_crawler :: Lens' GetCrawlerResponse (Maybe Crawler) Source #

The metadata for the specified crawler.

GetCrawlerMetrics

getCrawlerMetrics_crawlerNameList :: Lens' GetCrawlerMetrics (Maybe [Text]) Source #

A list of the names of crawlers about which to retrieve metrics.

getCrawlerMetrics_maxResults :: Lens' GetCrawlerMetrics (Maybe Natural) Source #

The maximum size of a list to return.

getCrawlerMetrics_nextToken :: Lens' GetCrawlerMetrics (Maybe Text) Source #

A continuation token, if this is a continuation call.

getCrawlerMetricsResponse_nextToken :: Lens' GetCrawlerMetricsResponse (Maybe Text) Source #

A continuation token, if the returned list does not contain the last metric available.

GetCrawlers

getCrawlers_maxResults :: Lens' GetCrawlers (Maybe Natural) Source #

The number of crawlers to return on each call.

getCrawlers_nextToken :: Lens' GetCrawlers (Maybe Text) Source #

A continuation token, if this is a continuation request.

getCrawlersResponse_nextToken :: Lens' GetCrawlersResponse (Maybe Text) Source #

A continuation token, if the returned list has not reached the end of those defined in this customer account.

GetCustomEntityType

getCustomEntityType_name :: Lens' GetCustomEntityType Text Source #

The name of the custom pattern that you want to retrieve.

getCustomEntityTypeResponse_contextWords :: Lens' GetCustomEntityTypeResponse (Maybe (NonEmpty Text)) Source #

A list of context words, if specified when you created the custom pattern. If none of these context words are found within the vicinity of the regular expression, the data will not be detected as sensitive data.

getCustomEntityTypeResponse_name :: Lens' GetCustomEntityTypeResponse (Maybe Text) Source #

The name of the custom pattern that you retrieved.

getCustomEntityTypeResponse_regexString :: Lens' GetCustomEntityTypeResponse (Maybe Text) Source #

A regular expression string that is used for detecting sensitive data in a custom pattern.

GetDataCatalogEncryptionSettings

getDataCatalogEncryptionSettings_catalogId :: Lens' GetDataCatalogEncryptionSettings (Maybe Text) Source #

The ID of the Data Catalog to retrieve the security configuration for. If none is provided, the Amazon Web Services account ID is used by default.

GetDataQualityResult

getDataQualityResult_resultId :: Lens' GetDataQualityResult Text Source #

A unique result ID for the data quality result.

getDataQualityResultResponse_completedOn :: Lens' GetDataQualityResultResponse (Maybe UTCTime) Source #

The date and time when the run for this data quality result was completed.

getDataQualityResultResponse_dataSource :: Lens' GetDataQualityResultResponse (Maybe DataSource) Source #

The table associated with the data quality result, if any.

getDataQualityResultResponse_evaluationContext :: Lens' GetDataQualityResultResponse (Maybe Text) Source #

In a Glue Studio job, each node on the canvas is typically assigned a name, including data quality nodes. When a job has multiple data quality nodes, the evaluationContext distinguishes between them.

getDataQualityResultResponse_jobName :: Lens' GetDataQualityResultResponse (Maybe Text) Source #

The job name associated with the data quality result, if any.

getDataQualityResultResponse_jobRunId :: Lens' GetDataQualityResultResponse (Maybe Text) Source #

The job run ID associated with the data quality result, if any.

getDataQualityResultResponse_resultId :: Lens' GetDataQualityResultResponse (Maybe Text) Source #

A unique result ID for the data quality result.

getDataQualityResultResponse_ruleResults :: Lens' GetDataQualityResultResponse (Maybe (NonEmpty DataQualityRuleResult)) Source #

A list of DataQualityRuleResult objects representing the results for each rule.

getDataQualityResultResponse_rulesetEvaluationRunId :: Lens' GetDataQualityResultResponse (Maybe Text) Source #

The unique run ID associated with the ruleset evaluation.

getDataQualityResultResponse_rulesetName :: Lens' GetDataQualityResultResponse (Maybe Text) Source #

The name of the ruleset associated with the data quality result.

getDataQualityResultResponse_score :: Lens' GetDataQualityResultResponse (Maybe Double) Source #

An aggregate data quality score. Represents the ratio of rules that passed to the total number of rules.

getDataQualityResultResponse_startedOn :: Lens' GetDataQualityResultResponse (Maybe UTCTime) Source #

The date and time when the run for this data quality result started.

GetDataQualityRuleRecommendationRun

getDataQualityRuleRecommendationRunResponse_lastModifiedOn :: Lens' GetDataQualityRuleRecommendationRunResponse (Maybe UTCTime) Source #

A timestamp. The last point in time when this data quality rule recommendation run was modified.

getDataQualityRuleRecommendationRunResponse_recommendedRuleset :: Lens' GetDataQualityRuleRecommendationRunResponse (Maybe Text) Source #

When a start rule recommendation run completes, it creates a recommended ruleset (a set of rules). This member has those rules in Data Quality Definition Language (DQDL) format.

getDataQualityRuleRecommendationRunResponse_timeout :: Lens' GetDataQualityRuleRecommendationRunResponse (Maybe Natural) Source #

The timeout for a run in minutes. This is the maximum time that a run can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).

GetDataQualityRuleset

getDataQualityRulesetResponse_createdOn :: Lens' GetDataQualityRulesetResponse (Maybe UTCTime) Source #

A timestamp. The time and date that this data quality ruleset was created.

getDataQualityRulesetResponse_lastModifiedOn :: Lens' GetDataQualityRulesetResponse (Maybe UTCTime) Source #

A timestamp. The last point in time when this data quality ruleset was modified.

getDataQualityRulesetResponse_recommendationRunId :: Lens' GetDataQualityRulesetResponse (Maybe Text) Source #

When a ruleset was created from a recommendation run, this run ID is generated to link the two together.

getDataQualityRulesetResponse_ruleset :: Lens' GetDataQualityRulesetResponse (Maybe Text) Source #

A Data Quality Definition Language (DQDL) ruleset. For more information, see the Glue developer guide.

GetDataQualityRulesetEvaluationRun

getDataQualityRulesetEvaluationRunResponse_lastModifiedOn :: Lens' GetDataQualityRulesetEvaluationRunResponse (Maybe UTCTime) Source #

A timestamp. The last point in time when this data quality rule recommendation run was modified.

getDataQualityRulesetEvaluationRunResponse_timeout :: Lens' GetDataQualityRulesetEvaluationRunResponse (Maybe Natural) Source #

The timeout for a run in minutes. This is the maximum time that a run can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).

GetDatabase

getDatabase_catalogId :: Lens' GetDatabase (Maybe Text) Source #

The ID of the Data Catalog in which the database resides. If none is provided, the Amazon Web Services account ID is used by default.

getDatabase_name :: Lens' GetDatabase Text Source #

The name of the database to retrieve. For Hive compatibility, this should be all lowercase.

getDatabaseResponse_database :: Lens' GetDatabaseResponse (Maybe Database) Source #

The definition of the specified database in the Data Catalog.

GetDatabases

getDatabases_catalogId :: Lens' GetDatabases (Maybe Text) Source #

The ID of the Data Catalog from which to retrieve Databases. If none is provided, the Amazon Web Services account ID is used by default.

getDatabases_maxResults :: Lens' GetDatabases (Maybe Natural) Source #

The maximum number of databases to return in one response.

getDatabases_nextToken :: Lens' GetDatabases (Maybe Text) Source #

A continuation token, if this is a continuation call.

getDatabases_resourceShareType :: Lens' GetDatabases (Maybe ResourceShareType) Source #

Allows you to specify that you want to list the databases shared with your account. The allowable values are FOREIGN or ALL.

  • If set to FOREIGN, will list the databases shared with your account.
  • If set to ALL, will list the databases shared with your account, as well as the databases in your local account.

getDatabasesResponse_nextToken :: Lens' GetDatabasesResponse (Maybe Text) Source #

A continuation token for paginating the returned list of tokens, returned if the current segment of the list is not the last.

getDatabasesResponse_databaseList :: Lens' GetDatabasesResponse [Database] Source #

A list of Database objects from the specified catalog.
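
The following sketch shows one way the GetDatabases lenses might be used together: it requests both local and shared databases and counts the entries in a response. It assumes the `lens` package and that the enum value is exposed as the pattern `ResourceShareType_ALL`, following amazonka's usual naming for generated enums; the page size of 50 and the helper `databaseCount` are arbitrary illustrations.

```haskell
import Amazonka.Glue
  ( GetDatabases,
    GetDatabasesResponse,
    ResourceShareType (..),
    newGetDatabases,
  )
import Amazonka.Glue.Lens
  ( getDatabasesResponse_databaseList,
    getDatabases_maxResults,
    getDatabases_resourceShareType,
  )
import Control.Lens ((&), (?~), (^.))

-- List local databases as well as those shared with the account,
-- at most 50 per response (an arbitrary page size).
allVisibleDatabases :: GetDatabases
allVisibleDatabases =
  newGetDatabases
    & getDatabases_resourceShareType ?~ ResourceShareType_ALL
    & getDatabases_maxResults ?~ 50

-- Number of Database objects carried by one response page.
databaseCount :: GetDatabasesResponse -> Int
databaseCount resp = length (resp ^. getDatabasesResponse_databaseList)
```

Note that `getDatabasesResponse_databaseList` is a plain list rather than a `Maybe`, so no default is needed when reading it.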

GetDataflowGraph

GetDevEndpoint

getDevEndpoint_endpointName :: Lens' GetDevEndpoint Text Source #

Name of the DevEndpoint to retrieve information for.

GetDevEndpoints

getDevEndpoints_maxResults :: Lens' GetDevEndpoints (Maybe Natural) Source #

The maximum size of information to return.

getDevEndpoints_nextToken :: Lens' GetDevEndpoints (Maybe Text) Source #

A continuation token, if this is a continuation call.

getDevEndpointsResponse_nextToken :: Lens' GetDevEndpointsResponse (Maybe Text) Source #

A continuation token, if not all DevEndpoint definitions have yet been returned.

GetJob

getJob_jobName :: Lens' GetJob Text Source #

The name of the job definition to retrieve.

getJobResponse_job :: Lens' GetJobResponse (Maybe Job) Source #

The requested job definition.

getJobResponse_httpStatus :: Lens' GetJobResponse Int Source #

The response's http status code.

GetJobBookmark

getJobBookmark_runId :: Lens' GetJobBookmark (Maybe Text) Source #

The unique run identifier associated with this job run.

getJobBookmark_jobName :: Lens' GetJobBookmark Text Source #

The name of the job in question.

getJobBookmarkResponse_jobBookmarkEntry :: Lens' GetJobBookmarkResponse (Maybe JobBookmarkEntry) Source #

A structure that defines a point that a job can resume processing.

GetJobRun

getJobRun_predecessorsIncluded :: Lens' GetJobRun (Maybe Bool) Source #

True if a list of predecessor runs should be returned.

getJobRun_jobName :: Lens' GetJobRun Text Source #

Name of the job definition being run.

getJobRun_runId :: Lens' GetJobRun Text Source #

The ID of the job run.

getJobRunResponse_httpStatus :: Lens' GetJobRunResponse Int Source #

The response's http status code.

GetJobRuns

getJobRuns_maxResults :: Lens' GetJobRuns (Maybe Natural) Source #

The maximum size of the response.

getJobRuns_nextToken :: Lens' GetJobRuns (Maybe Text) Source #

A continuation token, if this is a continuation call.

getJobRuns_jobName :: Lens' GetJobRuns Text Source #

The name of the job definition for which to retrieve all job runs.

getJobRunsResponse_jobRuns :: Lens' GetJobRunsResponse (Maybe [JobRun]) Source #

A list of job-run metadata objects.

getJobRunsResponse_nextToken :: Lens' GetJobRunsResponse (Maybe Text) Source #

A continuation token, if not all requested job runs have been returned.
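
The nextToken lenses on a request and its response are typically used as a pair. The sketch below is one possible manual pagination loop, not the library's own mechanism: `collectPages`, `fetchJobRunsPage` and `allJobRuns` are hypothetical helper names, and the code assumes the `lens` package plus `send` and `runResourceT` from the core Amazonka module.

```haskell
import qualified Amazonka as AWS
import Amazonka.Glue (JobRun, newGetJobRuns)
import Amazonka.Glue.Lens
  ( getJobRunsResponse_jobRuns,
    getJobRunsResponse_nextToken,
    getJobRuns_nextToken,
  )
import Control.Lens ((&), (.~), (^.))
import Data.Maybe (fromMaybe)
import Data.Text (Text)

-- Generic driver: call the fetcher with the previous token until it
-- stops returning one, concatenating the items of every page.
collectPages :: Monad m => (Maybe token -> m ([a], Maybe token)) -> m [a]
collectPages fetch = go Nothing
  where
    go token = do
      (items, next) <- fetch token
      case next of
        Nothing -> pure items
        Just _ -> (items ++) <$> go next

-- Fetch one page of job runs, passing the continuation token (if any)
-- through getJobRuns_nextToken and reading the next one back.
fetchJobRunsPage :: AWS.Env -> Text -> Maybe Text -> IO ([JobRun], Maybe Text)
fetchJobRunsPage env jobName token = do
  resp <-
    AWS.runResourceT . AWS.send env $
      newGetJobRuns jobName & getJobRuns_nextToken .~ token
  pure
    ( fromMaybe [] (resp ^. getJobRunsResponse_jobRuns),
      resp ^. getJobRunsResponse_nextToken
    )

-- All job runs for a job, gathered page by page.
allJobRuns :: AWS.Env -> Text -> IO [JobRun]
allJobRuns env jobName = collectPages (fetchJobRunsPage env jobName)
```

Keeping the loop separate from the fetcher means the same driver can be reused for any other operation on this page that exposes a nextToken pair. Amazonka also provides a conduit-based `paginate` function for operations that have a pager instance, which may be preferable when results should be streamed rather than accumulated.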

GetJobs

getJobs_maxResults :: Lens' GetJobs (Maybe Natural) Source #

The maximum size of the response.

getJobs_nextToken :: Lens' GetJobs (Maybe Text) Source #

A continuation token, if this is a continuation call.

getJobsResponse_jobs :: Lens' GetJobsResponse (Maybe [Job]) Source #

A list of job definitions.

getJobsResponse_nextToken :: Lens' GetJobsResponse (Maybe Text) Source #

A continuation token, if not all job definitions have yet been returned.

getJobsResponse_httpStatus :: Lens' GetJobsResponse Int Source #

The response's http status code.

GetMLTaskRun

getMLTaskRun_transformId :: Lens' GetMLTaskRun Text Source #

The unique identifier of the machine learning transform.

getMLTaskRun_taskRunId :: Lens' GetMLTaskRun Text Source #

The unique identifier of the task run.

getMLTaskRunResponse_completedOn :: Lens' GetMLTaskRunResponse (Maybe UTCTime) Source #

The date and time when this task run was completed.

getMLTaskRunResponse_errorString :: Lens' GetMLTaskRunResponse (Maybe Text) Source #

The error strings that are associated with the task run.

getMLTaskRunResponse_executionTime :: Lens' GetMLTaskRunResponse (Maybe Int) Source #

The amount of time (in seconds) that the task run consumed resources.

getMLTaskRunResponse_lastModifiedOn :: Lens' GetMLTaskRunResponse (Maybe UTCTime) Source #

The date and time when this task run was last modified.

getMLTaskRunResponse_logGroupName :: Lens' GetMLTaskRunResponse (Maybe Text) Source #

The names of the log groups that are associated with the task run.

getMLTaskRunResponse_properties :: Lens' GetMLTaskRunResponse (Maybe TaskRunProperties) Source #

The list of properties that are associated with the task run.

getMLTaskRunResponse_startedOn :: Lens' GetMLTaskRunResponse (Maybe UTCTime) Source #

The date and time when this task run started.

getMLTaskRunResponse_taskRunId :: Lens' GetMLTaskRunResponse (Maybe Text) Source #

The unique run identifier associated with this run.

getMLTaskRunResponse_transformId :: Lens' GetMLTaskRunResponse (Maybe Text) Source #

The unique identifier of the task run.

GetMLTaskRuns

getMLTaskRuns_filter :: Lens' GetMLTaskRuns (Maybe TaskRunFilterCriteria) Source #

The filter criteria, in the TaskRunFilterCriteria structure, for the task run.

getMLTaskRuns_maxResults :: Lens' GetMLTaskRuns (Maybe Natural) Source #

The maximum number of results to return.

getMLTaskRuns_nextToken :: Lens' GetMLTaskRuns (Maybe Text) Source #

A token for pagination of the results. The default is empty.

getMLTaskRuns_sort :: Lens' GetMLTaskRuns (Maybe TaskRunSortCriteria) Source #

The sorting criteria, in the TaskRunSortCriteria structure, for the task run.

getMLTaskRuns_transformId :: Lens' GetMLTaskRuns Text Source #

The unique identifier of the machine learning transform.

getMLTaskRunsResponse_nextToken :: Lens' GetMLTaskRunsResponse (Maybe Text) Source #

A pagination token, if more results are available.

getMLTaskRunsResponse_taskRuns :: Lens' GetMLTaskRunsResponse (Maybe [TaskRun]) Source #

A list of task runs that are associated with the transform.

GetMLTransform

getMLTransform_transformId :: Lens' GetMLTransform Text Source #

The unique identifier of the transform, generated at the time that the transform was created.

getMLTransformResponse_createdOn :: Lens' GetMLTransformResponse (Maybe UTCTime) Source #

The date and time when the transform was created.

getMLTransformResponse_glueVersion :: Lens' GetMLTransformResponse (Maybe Text) Source #

This value determines which version of Glue this machine learning transform is compatible with. Glue 1.0 is recommended for most customers. If the value is not set, the Glue compatibility defaults to Glue 0.9. For more information, see Glue Versions in the developer guide.

getMLTransformResponse_inputRecordTables :: Lens' GetMLTransformResponse (Maybe [GlueTable]) Source #

A list of Glue table definitions used by the transform.

getMLTransformResponse_labelCount :: Lens' GetMLTransformResponse (Maybe Int) Source #

The number of labels available for this transform.

getMLTransformResponse_lastModifiedOn :: Lens' GetMLTransformResponse (Maybe UTCTime) Source #

The date and time when the transform was last modified.

getMLTransformResponse_maxCapacity :: Lens' GetMLTransformResponse (Maybe Double) Source #

The number of Glue data processing units (DPUs) that are allocated to task runs for this transform. You can allocate from 2 to 100 DPUs; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page.

When the WorkerType field is set to a value other than Standard, the MaxCapacity field is set automatically and becomes read-only.

getMLTransformResponse_maxRetries :: Lens' GetMLTransformResponse (Maybe Int) Source #

The maximum number of times to retry a task for this transform after a task run fails.

getMLTransformResponse_name :: Lens' GetMLTransformResponse (Maybe Text) Source #

The unique name given to the transform when it was created.

getMLTransformResponse_numberOfWorkers :: Lens' GetMLTransformResponse (Maybe Int) Source #

The number of workers of a defined workerType that are allocated when this task runs.

getMLTransformResponse_parameters :: Lens' GetMLTransformResponse (Maybe TransformParameters) Source #

The configuration parameters that are specific to the algorithm used.

getMLTransformResponse_role :: Lens' GetMLTransformResponse (Maybe Text) Source #

The name or Amazon Resource Name (ARN) of the IAM role with the required permissions.

getMLTransformResponse_schema :: Lens' GetMLTransformResponse (Maybe [SchemaColumn]) Source #

The Map<Column, Type> object that represents the schema that this transform accepts. Has an upper bound of 100 columns.

getMLTransformResponse_status :: Lens' GetMLTransformResponse (Maybe TransformStatusType) Source #

The last known status of the transform (to indicate whether it can be used or not). One of "NOT_READY", "READY", or "DELETING".

getMLTransformResponse_timeout :: Lens' GetMLTransformResponse (Maybe Natural) Source #

The timeout for a task run for this transform in minutes. This is the maximum time that a task run for this transform can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).

getMLTransformResponse_transformEncryption :: Lens' GetMLTransformResponse (Maybe TransformEncryption) Source #

The encryption-at-rest settings of the transform that apply to accessing user data. Machine learning transforms can access user data encrypted in Amazon S3 using KMS.

getMLTransformResponse_transformId :: Lens' GetMLTransformResponse (Maybe Text) Source #

The unique identifier of the transform, generated at the time that the transform was created.

getMLTransformResponse_workerType :: Lens' GetMLTransformResponse (Maybe WorkerType) Source #

The type of predefined worker that is allocated when this task runs. Accepts a value of Standard, G.1X, or G.2X.

  • For the Standard worker type, each worker provides 4 vCPU, 16 GB of memory and a 50GB disk, and 2 executors per worker.
  • For the G.1X worker type, each worker provides 4 vCPU, 16 GB of memory and a 64GB disk, and 1 executor per worker.
  • For the G.2X worker type, each worker provides 8 vCPU, 32 GB of memory and a 128GB disk, and 1 executor per worker.

GetMLTransforms

getMLTransforms_maxResults :: Lens' GetMLTransforms (Maybe Natural) Source #

The maximum number of results to return.

getMLTransforms_nextToken :: Lens' GetMLTransforms (Maybe Text) Source #

A pagination token to offset the results.

getMLTransformsResponse_nextToken :: Lens' GetMLTransformsResponse (Maybe Text) Source #

A pagination token, if more results are available.

GetMapping

getMapping_location :: Lens' GetMapping (Maybe Location) Source #

Parameters for the mapping.

getMapping_sinks :: Lens' GetMapping (Maybe [CatalogEntry]) Source #

A list of target tables.

getMapping_source :: Lens' GetMapping CatalogEntry Source #

Specifies the source table.

getMappingResponse_mapping :: Lens' GetMappingResponse [MappingEntry] Source #

A list of mappings to the specified targets.

GetPartition

getPartition_catalogId :: Lens' GetPartition (Maybe Text) Source #

The ID of the Data Catalog where the partition in question resides. If none is provided, the Amazon Web Services account ID is used by default.

getPartition_databaseName :: Lens' GetPartition Text Source #

The name of the catalog database where the partition resides.

getPartition_tableName :: Lens' GetPartition Text Source #

The name of the partition's table.

getPartition_partitionValues :: Lens' GetPartition [Text] Source #

The values that define the partition.

getPartitionResponse_partition :: Lens' GetPartitionResponse (Maybe Partition) Source #

The requested information, in the form of a Partition object.

GetPartitionIndexes

getPartitionIndexes_catalogId :: Lens' GetPartitionIndexes (Maybe Text) Source #

The catalog ID where the table resides.

getPartitionIndexes_nextToken :: Lens' GetPartitionIndexes (Maybe Text) Source #

A continuation token, included if this is a continuation call.

getPartitionIndexes_databaseName :: Lens' GetPartitionIndexes Text Source #

Specifies the name of a database from which you want to retrieve partition indexes.

getPartitionIndexes_tableName :: Lens' GetPartitionIndexes Text Source #

Specifies the name of a table for which you want to retrieve the partition indexes.

getPartitionIndexesResponse_nextToken :: Lens' GetPartitionIndexesResponse (Maybe Text) Source #

A continuation token, present if the current list segment is not the last.

GetPartitions

getPartitions_catalogId :: Lens' GetPartitions (Maybe Text) Source #

The ID of the Data Catalog where the partitions in question reside. If none is provided, the Amazon Web Services account ID is used by default.

getPartitions_excludeColumnSchema :: Lens' GetPartitions (Maybe Bool) Source #

When true, the partition column schema is not returned. This is useful when you are interested only in other partition attributes, such as partition values or location. Omitting the schema avoids a large response full of duplicate data.

getPartitions_expression :: Lens' GetPartitions (Maybe Text) Source #

An expression that filters the partitions to be returned.

The expression uses SQL syntax similar to the SQL WHERE filter clause. The SQL statement parser JSQLParser parses the expression.

Operators: The following are the operators that you can use in the Expression API call:

=
Checks whether the values of the two operands are equal; if yes, then the condition becomes true.

Example: Assume 'variable a' holds 10 and 'variable b' holds 20.

(a = b) is not true.

< >
Checks whether the values of two operands are equal; if the values are not equal, then the condition becomes true.

Example: (a < > b) is true.

>
Checks whether the value of the left operand is greater than the value of the right operand; if yes, then the condition becomes true.

Example: (a > b) is not true.

<
Checks whether the value of the left operand is less than the value of the right operand; if yes, then the condition becomes true.

Example: (a < b) is true.

>=
Checks whether the value of the left operand is greater than or equal to the value of the right operand; if yes, then the condition becomes true.

Example: (a >= b) is not true.

<=
Checks whether the value of the left operand is less than or equal to the value of the right operand; if yes, then the condition becomes true.

Example: (a <= b) is true.

AND, OR, IN, BETWEEN, LIKE, NOT, IS NULL
Logical operators.

Supported Partition Key Types: The following are the supported partition keys.

  • string
  • date
  • timestamp
  • int
  • bigint
  • long
  • tinyint
  • smallint
  • decimal

If a type is encountered that is not valid, an exception is thrown.

The following list shows the valid operators on each type. When you define a crawler, the partitionKey type is created as a STRING, to be compatible with the catalog partitions.

Sample API Call:

getPartitions_maxResults :: Lens' GetPartitions (Maybe Natural) Source #

The maximum number of partitions to return in a single response.

getPartitions_nextToken :: Lens' GetPartitions (Maybe Text) Source #

A continuation token, if this is not the first call to retrieve these partitions.

getPartitions_queryAsOfTime :: Lens' GetPartitions (Maybe UTCTime) Source #

The time as of when to read the partition contents. If not set, the most recent transaction commit time will be used. Cannot be specified along with TransactionId.

getPartitions_segment :: Lens' GetPartitions (Maybe Segment) Source #

The segment of the table's partitions to scan in this request.

getPartitions_transactionId :: Lens' GetPartitions (Maybe Text) Source #

The transaction ID at which to read the partition contents.

getPartitions_databaseName :: Lens' GetPartitions Text Source #

The name of the catalog database where the partitions reside.

getPartitions_tableName :: Lens' GetPartitions Text Source #

The name of the partitions' table.

getPartitionsResponse_nextToken :: Lens' GetPartitionsResponse (Maybe Text) Source #

A continuation token, if the returned list of partitions does not include the last one.
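
To illustrate how the GetPartitions lenses fit together, the sketch below builds a filtered request. The database, table and partition key names are hypothetical, the expression only uses operators listed above, and the `lens` package is assumed for the operators; `newGetPartitions` is taken to accept the database name followed by the table name.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Amazonka.Glue (GetPartitions, newGetPartitions)
import Amazonka.Glue.Lens
  ( getPartitions_excludeColumnSchema,
    getPartitions_expression,
    getPartitions_maxResults,
  )
import Control.Lens ((&), (?~), (^.))
import Data.Text (Text)

-- Partitions of a hypothetical "orders" table in "sales_db" for the
-- first half of 2015, without the repeated column schema.
partitionsFor2015 :: GetPartitions
partitionsFor2015 =
  newGetPartitions "sales_db" "orders"
    & getPartitions_expression ?~ "year = '2015' AND month BETWEEN '01' AND '06'"
    & getPartitions_excludeColumnSchema ?~ True
    & getPartitions_maxResults ?~ 100

-- Reads the filter expression back from a request, if one is set.
filterOf :: GetPartitions -> Maybe Text
filterOf req = req ^. getPartitions_expression
```

The partition values are quoted as strings here because, as noted above, crawler-created partition keys are stored as STRING.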

GetPlan

getPlan_additionalPlanOptionsMap :: Lens' GetPlan (Maybe (HashMap Text Text)) Source #

A map to hold additional optional key-value parameters.

Currently, these key-value pairs are supported:

  • inferSchema  —  Specifies whether to set inferSchema to true or false for the default script generated by a Glue job. For example, to set inferSchema to true, pass the following key-value pair:

    --additional-plan-options-map '{"inferSchema":"true"}'

getPlan_language :: Lens' GetPlan (Maybe Language) Source #

The programming language of the code to perform the mapping.

getPlan_location :: Lens' GetPlan (Maybe Location) Source #

The parameters for the mapping.

getPlan_mapping :: Lens' GetPlan [MappingEntry] Source #

The list of mappings from a source table to target tables.

getPlanResponse_pythonScript :: Lens' GetPlanResponse (Maybe Text) Source #

A Python script to perform the mapping.

getPlanResponse_scalaCode :: Lens' GetPlanResponse (Maybe Text) Source #

The Scala code to perform the mapping.

getPlanResponse_httpStatus :: Lens' GetPlanResponse Int Source #

The response's http status code.
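
The inferSchema option described above can also be set through the lens instead of the command line. This is a rough sketch under several assumptions: the `lens` and `unordered-containers` packages are available, `newGetPlan` takes the source `CatalogEntry`, and `newCatalogEntry` takes a database name and a table name. The names `examplePlan`, `withInferSchema` and `infersSchema`, and the database and table used, are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Amazonka.Glue (GetPlan, newCatalogEntry, newGetPlan)
import Amazonka.Glue.Lens (getPlan_additionalPlanOptionsMap)
import Control.Lens ((&), (?~), (^.))
import qualified Data.HashMap.Strict as HashMap

-- A request whose source is a hypothetical table "orders" in "sales_db".
examplePlan :: GetPlan
examplePlan = newGetPlan (newCatalogEntry "sales_db" "orders")

-- Replace the options map with one that turns inferSchema on.
withInferSchema :: GetPlan -> GetPlan
withInferSchema plan =
  plan
    & getPlan_additionalPlanOptionsMap
      ?~ HashMap.fromList [("inferSchema", "true")]

-- True when the request carries inferSchema = "true".
infersSchema :: GetPlan -> Bool
infersSchema plan =
  (plan ^. getPlan_additionalPlanOptionsMap >>= HashMap.lookup "inferSchema")
    == Just "true"
```

Note that `?~` overwrites any options map already present on the request; merging into an existing map would need `%~` with an insert instead.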

GetRegistry

getRegistry_registryId :: Lens' GetRegistry RegistryId Source #

This is a wrapper structure that may contain the registry name and Amazon Resource Name (ARN).

getRegistryResponse_createdTime :: Lens' GetRegistryResponse (Maybe Text) Source #

The date and time the registry was created.

getRegistryResponse_registryArn :: Lens' GetRegistryResponse (Maybe Text) Source #

The Amazon Resource Name (ARN) of the registry.

getRegistryResponse_updatedTime :: Lens' GetRegistryResponse (Maybe Text) Source #

The date and time the registry was updated.

GetResourcePolicies

getResourcePolicies_nextToken :: Lens' GetResourcePolicies (Maybe Text) Source #

A continuation token, if this is a continuation request.

getResourcePoliciesResponse_getResourcePoliciesResponseList :: Lens' GetResourcePoliciesResponse (Maybe [GluePolicy]) Source #

A list of the individual resource policies and the account-level resource policy.

getResourcePoliciesResponse_nextToken :: Lens' GetResourcePoliciesResponse (Maybe Text) Source #

A continuation token, if the returned list does not contain the last resource policy available.

GetResourcePolicy

getResourcePolicy_resourceArn :: Lens' GetResourcePolicy (Maybe Text) Source #

The ARN of the Glue resource for which to retrieve the resource policy. If not supplied, the Data Catalog resource policy is returned. Use GetResourcePolicies to view all existing resource policies. For more information, see Specifying Glue Resource ARNs.

getResourcePolicyResponse_createTime :: Lens' GetResourcePolicyResponse (Maybe UTCTime) Source #

The date and time at which the policy was created.

getResourcePolicyResponse_policyHash :: Lens' GetResourcePolicyResponse (Maybe Text) Source #

Contains the hash value associated with this policy.

getResourcePolicyResponse_policyInJson :: Lens' GetResourcePolicyResponse (Maybe Text) Source #

Contains the requested policy document, in JSON format.

getResourcePolicyResponse_updateTime :: Lens' GetResourcePolicyResponse (Maybe UTCTime) Source #

The date and time at which the policy was last updated.

GetSchema

getSchema_schemaId :: Lens' GetSchema SchemaId Source #

This is a wrapper structure to contain schema identity fields. The structure contains:

  • SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. Either SchemaArn or SchemaName and RegistryName has to be provided.
  • SchemaId$SchemaName: The name of the schema. Either SchemaArn or SchemaName and RegistryName has to be provided.

getSchemaResponse_createdTime :: Lens' GetSchemaResponse (Maybe Text) Source #

The date and time the schema was created.

getSchemaResponse_dataFormat :: Lens' GetSchemaResponse (Maybe DataFormat) Source #

The data format of the schema definition. Currently AVRO, JSON and PROTOBUF are supported.

getSchemaResponse_description :: Lens' GetSchemaResponse (Maybe Text) Source #

A description of the schema, if specified when it was created.

getSchemaResponse_latestSchemaVersion :: Lens' GetSchemaResponse (Maybe Natural) Source #

The latest version of the schema associated with the returned schema definition.

getSchemaResponse_nextSchemaVersion :: Lens' GetSchemaResponse (Maybe Natural) Source #

The next version of the schema associated with the returned schema definition.

getSchemaResponse_registryArn :: Lens' GetSchemaResponse (Maybe Text) Source #

The Amazon Resource Name (ARN) of the registry.

getSchemaResponse_schemaArn :: Lens' GetSchemaResponse (Maybe Text) Source #

The Amazon Resource Name (ARN) of the schema.

getSchemaResponse_schemaCheckpoint :: Lens' GetSchemaResponse (Maybe Natural) Source #

The version number of the checkpoint (the last time the compatibility mode was changed).

getSchemaResponse_updatedTime :: Lens' GetSchemaResponse (Maybe Text) Source #

The date and time the schema was updated.

getSchemaResponse_httpStatus :: Lens' GetSchemaResponse Int Source #

The response's http status code.

GetSchemaByDefinition

getSchemaByDefinition_schemaId :: Lens' GetSchemaByDefinition SchemaId Source #

This is a wrapper structure to contain schema identity fields. The structure contains:

  • SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. One of SchemaArn or SchemaName has to be provided.
  • SchemaId$SchemaName: The name of the schema. One of SchemaArn or SchemaName has to be provided.

getSchemaByDefinition_schemaDefinition :: Lens' GetSchemaByDefinition Text Source #

The definition of the schema for which schema details are required.

getSchemaByDefinitionResponse_dataFormat :: Lens' GetSchemaByDefinitionResponse (Maybe DataFormat) Source #

The data format of the schema definition. Currently AVRO, JSON and PROTOBUF are supported.

GetSchemaVersion

getSchemaVersion_schemaId :: Lens' GetSchemaVersion (Maybe SchemaId) Source #

This is a wrapper structure to contain schema identity fields. The structure contains:

  • SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. Either SchemaArn or SchemaName and RegistryName has to be provided.
  • SchemaId$SchemaName: The name of the schema. Either SchemaArn or SchemaName and RegistryName has to be provided.

getSchemaVersion_schemaVersionId :: Lens' GetSchemaVersion (Maybe Text) Source #

The SchemaVersionId of the schema version. This field is required for fetching by schema ID. Either this or the SchemaId wrapper has to be provided.

getSchemaVersionResponse_createdTime :: Lens' GetSchemaVersionResponse (Maybe Text) Source #

The date and time the schema version was created.

getSchemaVersionResponse_dataFormat :: Lens' GetSchemaVersionResponse (Maybe DataFormat) Source #

The data format of the schema definition. Currently AVRO, JSON and PROTOBUF are supported.

getSchemaVersionResponse_schemaArn :: Lens' GetSchemaVersionResponse (Maybe Text) Source #

The Amazon Resource Name (ARN) of the schema.

GetSchemaVersionsDiff

getSchemaVersionsDiff_schemaId :: Lens' GetSchemaVersionsDiff SchemaId Source #

This is a wrapper structure to contain schema identity fields. The structure contains:

  • SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. One of SchemaArn or SchemaName has to be provided.
  • SchemaId$SchemaName: The name of the schema. One of SchemaArn or SchemaName has to be provided.

getSchemaVersionsDiff_schemaDiffType :: Lens' GetSchemaVersionsDiff SchemaDiffType Source #

Refers to SYNTAX_DIFF, which is the currently supported diff type.

getSchemaVersionsDiffResponse_diff :: Lens' GetSchemaVersionsDiffResponse (Maybe Text) Source #

The difference between schemas as a string in JsonPatch format.

GetSecurityConfiguration

getSecurityConfiguration_name :: Lens' GetSecurityConfiguration Text Source #

The name of the security configuration to retrieve.

GetSecurityConfigurations

getSecurityConfigurations_nextToken :: Lens' GetSecurityConfigurations (Maybe Text) Source #

A continuation token, if this is a continuation call.

getSecurityConfigurationsResponse_nextToken :: Lens' GetSecurityConfigurationsResponse (Maybe Text) Source #

A continuation token, if there are more security configurations to return.

GetSession

getSession_requestOrigin :: Lens' GetSession (Maybe Text) Source #

The origin of the request.

getSession_id :: Lens' GetSession Text Source #

The ID of the session.

getSessionResponse_session :: Lens' GetSessionResponse (Maybe Session) Source #

The session object is returned in the response.

GetStatement

getStatement_sessionId :: Lens' GetStatement Text Source #

The Session ID of the statement.

getStatement_id :: Lens' GetStatement Int Source #

The Id of the statement.

GetTable

getTable_catalogId :: Lens' GetTable (Maybe Text) Source #

The ID of the Data Catalog where the table resides. If none is provided, the Amazon Web Services account ID is used by default.

getTable_queryAsOfTime :: Lens' GetTable (Maybe UTCTime) Source #

The time as of when to read the table contents. If not set, the most recent transaction commit time will be used. Cannot be specified along with TransactionId.

getTable_transactionId :: Lens' GetTable (Maybe Text) Source #

The transaction ID at which to read the table contents.

getTable_databaseName :: Lens' GetTable Text Source #

The name of the database in the catalog in which the table resides. For Hive compatibility, this name is entirely lowercase.

getTable_name :: Lens' GetTable Text Source #

The name of the table for which to retrieve the definition. For Hive compatibility, this name is entirely lowercase.

getTableResponse_table :: Lens' GetTableResponse (Maybe Table) Source #

The Table object that defines the specified table.

getTableResponse_httpStatus :: Lens' GetTableResponse Int Source #

The response's http status code.
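
Every entry on this page is a `Lens'` from a request or response record onto one field. The sketch below shows how a lens of that shape reads and updates a field, assuming the usual van Laarhoven encoding. `TableRequest`, `catalogIdL`, `view` and `set` are hypothetical names defined here for illustration; in real code a lens library would supply `view` and `set`, and the generated lenses would replace `catalogIdL`.

```haskell
import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- HYPOTHETICAL stand-in for a request record; not the amazonka GetTable type.
data TableRequest = TableRequest
  { reqCatalogId :: Maybe String
  , reqDatabaseName :: String
  , reqName :: String
  }
  deriving (Eq, Show)

-- | Lens onto the optional catalog ID, written out in the van Laarhoven
-- form: it works for any Functor chosen by the caller.
catalogIdL ::
  Functor f =>
  (Maybe String -> f (Maybe String)) ->
  TableRequest ->
  f TableRequest
catalogIdL f r = (\c -> r {reqCatalogId = c}) <$> f (reqCatalogId r)

-- | Read the focused field by instantiating the functor to Const.
view :: ((a -> Const a a) -> s -> Const a s) -> s -> a
view l = getConst . l Const

-- | Replace the focused field by instantiating the functor to Identity.
set :: ((a -> Identity a) -> s -> Identity s) -> a -> s -> s
set l a = runIdentity . l (const (Identity a))
```

For example, `set catalogIdL (Just "123456789012") req` returns a copy of `req` with only the catalog ID changed, and `view catalogIdL` reads it back.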

GetTableVersion

getTableVersion_catalogId :: Lens' GetTableVersion (Maybe Text) Source #

The ID of the Data Catalog where the tables reside. If none is provided, the Amazon Web Services account ID is used by default.

getTableVersion_versionId :: Lens' GetTableVersion (Maybe Text) Source #

The ID value of the table version to be retrieved. A VersionID is a string representation of an integer. Each version is incremented by 1.
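
Because a version ID is described as the string form of an integer that grows by 1, the following ID can be computed locally. A minimal sketch, assuming IDs are non-negative decimal integers; `nextVersionId` is a hypothetical helper, not part of the library.

```haskell
import Text.Read (readMaybe)

-- | Compute the version ID that follows the given one.
-- Returns Nothing when the input does not parse as a non-negative
-- integer (ASSUMPTION: valid IDs are never negative).
nextVersionId :: String -> Maybe String
nextVersionId s = do
  n <- readMaybe s :: Maybe Integer
  if n >= 0 then Just (show (n + 1)) else Nothing
```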

getTableVersion_databaseName :: Lens' GetTableVersion Text Source #

The database in the catalog in which the table resides. For Hive compatibility, this name is entirely lowercase.

getTableVersion_tableName :: Lens' GetTableVersion Text Source #

The name of the table. For Hive compatibility, this name is entirely lowercase.

GetTableVersions

getTableVersions_catalogId :: Lens' GetTableVersions (Maybe Text) Source #

The ID of the Data Catalog where the tables reside. If none is provided, the Amazon Web Services account ID is used by default.

getTableVersions_maxResults :: Lens' GetTableVersions (Maybe Natural) Source #

The maximum number of table versions to return in one response.

getTableVersions_nextToken :: Lens' GetTableVersions (Maybe Text) Source #

A continuation token, if this is not the first call.

getTableVersions_databaseName :: Lens' GetTableVersions Text Source #

The database in the catalog in which the table resides. For Hive compatibility, this name is entirely lowercase.

getTableVersions_tableName :: Lens' GetTableVersions Text Source #

The name of the table. For Hive compatibility, this name is entirely lowercase.

getTableVersionsResponse_nextToken :: Lens' GetTableVersionsResponse (Maybe Text) Source #

A continuation token, if the list of available versions does not include the last one.

getTableVersionsResponse_tableVersions :: Lens' GetTableVersionsResponse (Maybe [TableVersion]) Source #

A list of strings identifying available versions of the specified table.

GetTables

getTables_catalogId :: Lens' GetTables (Maybe Text) Source #

The ID of the Data Catalog where the tables reside. If none is provided, the Amazon Web Services account ID is used by default.

getTables_expression :: Lens' GetTables (Maybe Text) Source #

A regular expression pattern. If present, only those tables whose names match the pattern are returned.

getTables_maxResults :: Lens' GetTables (Maybe Natural) Source #

The maximum number of tables to return in a single response.

getTables_nextToken :: Lens' GetTables (Maybe Text) Source #

A continuation token, included if this is a continuation call.

getTables_queryAsOfTime :: Lens' GetTables (Maybe UTCTime) Source #

The time as of when to read the table contents. If not set, the most recent transaction commit time will be used. Cannot be specified along with TransactionId.

getTables_transactionId :: Lens' GetTables (Maybe Text) Source #

The transaction ID at which to read the table contents.

getTables_databaseName :: Lens' GetTables Text Source #

The database in the catalog whose tables to list. For Hive compatibility, this name is entirely lowercase.

getTablesResponse_nextToken :: Lens' GetTablesResponse (Maybe Text) Source #

A continuation token, present if the current list segment is not the last.

getTablesResponse_tableList :: Lens' GetTablesResponse (Maybe [Table]) Source #

A list of the requested Table objects.

getTablesResponse_httpStatus :: Lens' GetTablesResponse Int Source #

The response's http status code.
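
The request and response `nextToken` lenses pair up into the usual pagination loop: the token from each response goes into the next request until no token comes back. The sketch below shows that loop against a stand-in page type. `Page`, `collectAll` and `fakeFetch` are hypothetical and defined only for this example; in real code the fetch function would send a request with its `nextToken` field set.

```haskell
-- HYPOTHETICAL stand-in for a paginated response: one page of items plus
-- an optional continuation token. Not an amazonka type.
data Page a = Page
  { pageItems :: [a]
  , pageNextToken :: Maybe String
  }

-- | Collect every page. The first call passes Nothing; each later call
-- passes the token from the previous response; the loop ends when a
-- response carries no token.
collectAll :: Monad m => (Maybe String -> m (Page a)) -> m [a]
collectAll fetch = go Nothing
  where
    go token = do
      page <- fetch token
      case pageNextToken page of
        Nothing -> pure (pageItems page)
        next -> fmap (pageItems page ++) (go next)

-- | A fake three-page listing used only to exercise the loop.
-- An unknown token makes the whole collection fail with Nothing.
fakeFetch :: Maybe String -> Maybe (Page String)
fakeFetch Nothing = Just (Page ["a", "b"] (Just "t1"))
fakeFetch (Just "t1") = Just (Page ["c"] (Just "t2"))
fakeFetch (Just "t2") = Just (Page ["d"] Nothing)
fakeFetch _ = Nothing
```

Keeping the fetch function abstract means the same loop serves any of the list operations here that expose a `nextToken` on both request and response.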

GetTags

getTags_resourceArn :: Lens' GetTags Text Source #

The Amazon Resource Name (ARN) of the resource for which to retrieve tags.

getTagsResponse_httpStatus :: Lens' GetTagsResponse Int Source #

The response's http status code.

GetTrigger

getTrigger_name :: Lens' GetTrigger Text Source #

The name of the trigger to retrieve.

GetTriggers

getTriggers_dependentJobName :: Lens' GetTriggers (Maybe Text) Source #

The name of the job to retrieve triggers for. The trigger that can start this job is returned, and if there is no such trigger, all triggers are returned.

getTriggers_maxResults :: Lens' GetTriggers (Maybe Natural) Source #

The maximum size of the response.

getTriggers_nextToken :: Lens' GetTriggers (Maybe Text) Source #

A continuation token, if this is a continuation call.

getTriggersResponse_nextToken :: Lens' GetTriggersResponse (Maybe Text) Source #

A continuation token, if not all the requested triggers have yet been returned.

getTriggersResponse_triggers :: Lens' GetTriggersResponse (Maybe [Trigger]) Source #

A list of triggers for the specified job.

GetUnfilteredPartitionMetadata

GetUnfilteredPartitionsMetadata

GetUnfilteredTableMetadata

GetUserDefinedFunction

getUserDefinedFunction_catalogId :: Lens' GetUserDefinedFunction (Maybe Text) Source #

The ID of the Data Catalog where the function to be retrieved is located. If none is provided, the Amazon Web Services account ID is used by default.

getUserDefinedFunction_databaseName :: Lens' GetUserDefinedFunction Text Source #

The name of the catalog database where the function is located.

GetUserDefinedFunctions

getUserDefinedFunctions_catalogId :: Lens' GetUserDefinedFunctions (Maybe Text) Source #

The ID of the Data Catalog where the functions to be retrieved are located. If none is provided, the Amazon Web Services account ID is used by default.

getUserDefinedFunctions_databaseName :: Lens' GetUserDefinedFunctions (Maybe Text) Source #

The name of the catalog database where the functions are located. If none is provided, functions from all the databases across the catalog will be returned.

getUserDefinedFunctions_maxResults :: Lens' GetUserDefinedFunctions (Maybe Natural) Source #

The maximum number of functions to return in one response.

getUserDefinedFunctions_nextToken :: Lens' GetUserDefinedFunctions (Maybe Text) Source #

A continuation token, if this is a continuation call.

getUserDefinedFunctions_pattern :: Lens' GetUserDefinedFunctions Text Source #

An optional function-name pattern string that filters the function definitions returned.

getUserDefinedFunctionsResponse_nextToken :: Lens' GetUserDefinedFunctionsResponse (Maybe Text) Source #

A continuation token, if the list of functions returned does not include the last requested function.

GetWorkflow

getWorkflow_includeGraph :: Lens' GetWorkflow (Maybe Bool) Source #

Specifies whether to include a graph when returning the workflow resource metadata.

getWorkflow_name :: Lens' GetWorkflow Text Source #

The name of the workflow to retrieve.

getWorkflowResponse_workflow :: Lens' GetWorkflowResponse (Maybe Workflow) Source #

The resource metadata for the workflow.

GetWorkflowRun

getWorkflowRun_includeGraph :: Lens' GetWorkflowRun (Maybe Bool) Source #

Specifies whether to include the workflow graph in the response.

getWorkflowRun_name :: Lens' GetWorkflowRun Text Source #

Name of the workflow being run.

getWorkflowRun_runId :: Lens' GetWorkflowRun Text Source #

The ID of the workflow run.

GetWorkflowRunProperties

getWorkflowRunProperties_runId :: Lens' GetWorkflowRunProperties Text Source #

The ID of the workflow run whose run properties should be returned.

getWorkflowRunPropertiesResponse_runProperties :: Lens' GetWorkflowRunPropertiesResponse (Maybe (HashMap Text Text)) Source #

The workflow run properties which were set during the specified run.

GetWorkflowRuns

getWorkflowRuns_includeGraph :: Lens' GetWorkflowRuns (Maybe Bool) Source #

Specifies whether to include the workflow graph in the response.

getWorkflowRuns_maxResults :: Lens' GetWorkflowRuns (Maybe Natural) Source #

The maximum number of workflow runs to be included in the response.

getWorkflowRuns_nextToken :: Lens' GetWorkflowRuns (Maybe Text) Source #

A continuation token, if this is a continuation call.

getWorkflowRuns_name :: Lens' GetWorkflowRuns Text Source #

Name of the workflow whose run metadata should be returned.

getWorkflowRunsResponse_nextToken :: Lens' GetWorkflowRunsResponse (Maybe Text) Source #

A continuation token, if not all requested workflow runs have been returned.

ImportCatalogToGlue

importCatalogToGlue_catalogId :: Lens' ImportCatalogToGlue (Maybe Text) Source #

The ID of the catalog to import. Currently, this should be the Amazon Web Services account ID.

ListBlueprints

listBlueprints_maxResults :: Lens' ListBlueprints (Maybe Natural) Source #

The maximum size of a list to return.

listBlueprints_nextToken :: Lens' ListBlueprints (Maybe Text) Source #

A continuation token, if this is a continuation request.

listBlueprints_tags :: Lens' ListBlueprints (Maybe (HashMap Text Text)) Source #

Filters the list by an Amazon Web Services resource tag.

listBlueprintsResponse_blueprints :: Lens' ListBlueprintsResponse (Maybe [Text]) Source #

List of names of blueprints in the account.

listBlueprintsResponse_nextToken :: Lens' ListBlueprintsResponse (Maybe Text) Source #

A continuation token, if not all blueprint names have been returned.

ListCrawlers

listCrawlers_maxResults :: Lens' ListCrawlers (Maybe Natural) Source #

The maximum size of a list to return.

listCrawlers_nextToken :: Lens' ListCrawlers (Maybe Text) Source #

A continuation token, if this is a continuation request.

listCrawlers_tags :: Lens' ListCrawlers (Maybe (HashMap Text Text)) Source #

Specifies to return only these tagged resources.

listCrawlersResponse_crawlerNames :: Lens' ListCrawlersResponse (Maybe [Text]) Source #

The names of all crawlers in the account, or the crawlers with the specified tags.

listCrawlersResponse_nextToken :: Lens' ListCrawlersResponse (Maybe Text) Source #

A continuation token, if the returned list does not contain the last item available.

ListCrawls

listCrawls_filters :: Lens' ListCrawls (Maybe [CrawlsFilter]) Source #

Filters the crawls by the criteria you specify in a list of CrawlsFilter objects.

listCrawls_maxResults :: Lens' ListCrawls (Maybe Natural) Source #

The maximum number of results to return. The default is 20, and maximum is 100.

listCrawls_nextToken :: Lens' ListCrawls (Maybe Text) Source #

A continuation token, if this is a continuation call.

listCrawls_crawlerName :: Lens' ListCrawls Text Source #

The name of the crawler whose runs you want to retrieve.

listCrawlsResponse_crawls :: Lens' ListCrawlsResponse (Maybe [CrawlerHistory]) Source #

A list of CrawlerHistory objects representing the crawl runs that meet your criteria.

listCrawlsResponse_nextToken :: Lens' ListCrawlsResponse (Maybe Text) Source #

A continuation token for paginating the returned list of tokens, returned if the current segment of the list is not the last.

ListCustomEntityTypes

listCustomEntityTypes_nextToken :: Lens' ListCustomEntityTypes (Maybe Text) Source #

A paginated token to offset the results.

listCustomEntityTypesResponse_customEntityTypes :: Lens' ListCustomEntityTypesResponse (Maybe [CustomEntityType]) Source #

A list of CustomEntityType objects representing custom patterns.

listCustomEntityTypesResponse_nextToken :: Lens' ListCustomEntityTypesResponse (Maybe Text) Source #

A pagination token, if more results are available.

ListDataQualityResults

listDataQualityResults_nextToken :: Lens' ListDataQualityResults (Maybe Text) Source #

A paginated token to offset the results.

listDataQualityResultsResponse_nextToken :: Lens' ListDataQualityResultsResponse (Maybe Text) Source #

A pagination token, if more results are available.

ListDataQualityRuleRecommendationRuns

ListDataQualityRulesetEvaluationRuns

listDataQualityRulesetEvaluationRunsResponse_runs :: Lens' ListDataQualityRulesetEvaluationRunsResponse (Maybe [DataQualityRulesetEvaluationRunDescription]) Source #

A list of DataQualityRulesetEvaluationRunDescription objects representing data quality ruleset runs.

ListDataQualityRulesets

listDataQualityRulesetsResponse_rulesets :: Lens' ListDataQualityRulesetsResponse (Maybe [DataQualityRulesetListDetails]) Source #

A paginated list of rulesets for the specified list of Glue tables.

ListDevEndpoints

listDevEndpoints_maxResults :: Lens' ListDevEndpoints (Maybe Natural) Source #

The maximum size of a list to return.

listDevEndpoints_nextToken :: Lens' ListDevEndpoints (Maybe Text) Source #

A continuation token, if this is a continuation request.

listDevEndpoints_tags :: Lens' ListDevEndpoints (Maybe (HashMap Text Text)) Source #

Specifies to return only these tagged resources.

listDevEndpointsResponse_devEndpointNames :: Lens' ListDevEndpointsResponse (Maybe [Text]) Source #

The names of all the DevEndpoints in the account, or the DevEndpoints with the specified tags.

listDevEndpointsResponse_nextToken :: Lens' ListDevEndpointsResponse (Maybe Text) Source #

A continuation token, if the returned list does not contain the last item available.

ListJobs

listJobs_maxResults :: Lens' ListJobs (Maybe Natural) Source #

The maximum size of a list to return.

listJobs_nextToken :: Lens' ListJobs (Maybe Text) Source #

A continuation token, if this is a continuation request.

listJobs_tags :: Lens' ListJobs (Maybe (HashMap Text Text)) Source #

Specifies to return only these tagged resources.

listJobsResponse_jobNames :: Lens' ListJobsResponse (Maybe [Text]) Source #

The names of all jobs in the account, or the jobs with the specified tags.

listJobsResponse_nextToken :: Lens' ListJobsResponse (Maybe Text) Source #

A continuation token, if the returned list does not contain the last item available.

listJobsResponse_httpStatus :: Lens' ListJobsResponse Int Source #

The response's http status code.

ListMLTransforms

listMLTransforms_filter :: Lens' ListMLTransforms (Maybe TransformFilterCriteria) Source #

A TransformFilterCriteria used to filter the machine learning transforms.

listMLTransforms_maxResults :: Lens' ListMLTransforms (Maybe Natural) Source #

The maximum size of a list to return.

listMLTransforms_nextToken :: Lens' ListMLTransforms (Maybe Text) Source #

A continuation token, if this is a continuation request.

listMLTransforms_sort :: Lens' ListMLTransforms (Maybe TransformSortCriteria) Source #

A TransformSortCriteria used to sort the machine learning transforms.

listMLTransforms_tags :: Lens' ListMLTransforms (Maybe (HashMap Text Text)) Source #

Specifies to return only these tagged resources.

listMLTransformsResponse_nextToken :: Lens' ListMLTransformsResponse (Maybe Text) Source #

A continuation token, if the returned list does not contain the last item available.

listMLTransformsResponse_transformIds :: Lens' ListMLTransformsResponse [Text] Source #

The identifiers of all the machine learning transforms in the account, or the machine learning transforms with the specified tags.

ListRegistries

listRegistries_maxResults :: Lens' ListRegistries (Maybe Natural) Source #

Maximum number of results required per page. If no value is supplied, the default is 25 per page.

listRegistries_nextToken :: Lens' ListRegistries (Maybe Text) Source #

A continuation token, if this is a continuation call.

listRegistriesResponse_nextToken :: Lens' ListRegistriesResponse (Maybe Text) Source #

A continuation token for paginating the returned list of tokens, returned if the current segment of the list is not the last.

listRegistriesResponse_registries :: Lens' ListRegistriesResponse (Maybe [RegistryListItem]) Source #

An array of RegistryListItem objects containing minimal details of each registry.

ListSchemaVersions

listSchemaVersions_maxResults :: Lens' ListSchemaVersions (Maybe Natural) Source #

Maximum number of results required per page. If no value is supplied, the default is 25 per page.

listSchemaVersions_nextToken :: Lens' ListSchemaVersions (Maybe Text) Source #

A continuation token, if this is a continuation call.

listSchemaVersions_schemaId :: Lens' ListSchemaVersions SchemaId Source #

This is a wrapper structure to contain schema identity fields. The structure contains:

  • SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. Either SchemaArn or SchemaName and RegistryName has to be provided.
  • SchemaId$SchemaName: The name of the schema. Either SchemaArn or SchemaName and RegistryName has to be provided.

listSchemaVersionsResponse_nextToken :: Lens' ListSchemaVersionsResponse (Maybe Text) Source #

A continuation token for paginating the returned list of tokens, returned if the current segment of the list is not the last.

listSchemaVersionsResponse_schemas :: Lens' ListSchemaVersionsResponse (Maybe [SchemaVersionListItem]) Source #

An array of SchemaVersionListItem objects containing details of each schema version.

ListSchemas

listSchemas_maxResults :: Lens' ListSchemas (Maybe Natural) Source #

Maximum number of results required per page. If no value is supplied, the default is 25 per page.

listSchemas_nextToken :: Lens' ListSchemas (Maybe Text) Source #

A continuation token, if this is a continuation call.

listSchemas_registryId :: Lens' ListSchemas (Maybe RegistryId) Source #

A wrapper structure that may contain the registry name and Amazon Resource Name (ARN).

listSchemasResponse_nextToken :: Lens' ListSchemasResponse (Maybe Text) Source #

A continuation token for paginating the returned list of tokens, returned if the current segment of the list is not the last.

listSchemasResponse_schemas :: Lens' ListSchemasResponse (Maybe [SchemaListItem]) Source #

An array of SchemaListItem objects containing details of each schema.

ListSessions

listSessions_maxResults :: Lens' ListSessions (Maybe Natural) Source #

The maximum number of results.

listSessions_nextToken :: Lens' ListSessions (Maybe Text) Source #

The token for the next set of results, or null if there are no more results.

listSessions_tags :: Lens' ListSessions (Maybe (HashMap Text Text)) Source #

Tags belonging to the session.

listSessionsResponse_nextToken :: Lens' ListSessionsResponse (Maybe Text) Source #

The token for the next set of results, or null if there are no more results.

ListStatements

listStatements_nextToken :: Lens' ListStatements (Maybe Text) Source #

A continuation token, if this is a continuation call.

listStatements_requestOrigin :: Lens' ListStatements (Maybe Text) Source #

The origin of the request to list statements.

listStatements_sessionId :: Lens' ListStatements Text Source #

The Session ID of the statements.

listStatementsResponse_nextToken :: Lens' ListStatementsResponse (Maybe Text) Source #

A continuation token, if not all statements have yet been returned.

ListTriggers

listTriggers_dependentJobName :: Lens' ListTriggers (Maybe Text) Source #

The name of the job for which to retrieve triggers. The trigger that can start this job is returned. If there is no such trigger, all triggers are returned.

listTriggers_maxResults :: Lens' ListTriggers (Maybe Natural) Source #

The maximum size of a list to return.

listTriggers_nextToken :: Lens' ListTriggers (Maybe Text) Source #

A continuation token, if this is a continuation request.

listTriggers_tags :: Lens' ListTriggers (Maybe (HashMap Text Text)) Source #

Specifies to return only these tagged resources.

listTriggersResponse_nextToken :: Lens' ListTriggersResponse (Maybe Text) Source #

A continuation token, if the returned list does not contain the last item available.

listTriggersResponse_triggerNames :: Lens' ListTriggersResponse (Maybe [Text]) Source #

The names of all triggers in the account, or the triggers with the specified tags.

ListWorkflows

listWorkflows_maxResults :: Lens' ListWorkflows (Maybe Natural) Source #

The maximum size of a list to return.

listWorkflows_nextToken :: Lens' ListWorkflows (Maybe Text) Source #

A continuation token, if this is a continuation request.

listWorkflowsResponse_nextToken :: Lens' ListWorkflowsResponse (Maybe Text) Source #

A continuation token, if not all workflow names have been returned.

listWorkflowsResponse_workflows :: Lens' ListWorkflowsResponse (Maybe (NonEmpty Text)) Source #

List of names of workflows in the account.

PutDataCatalogEncryptionSettings

putDataCatalogEncryptionSettings_catalogId :: Lens' PutDataCatalogEncryptionSettings (Maybe Text) Source #

The ID of the Data Catalog to set the security configuration for. If none is provided, the Amazon Web Services account ID is used by default.

PutResourcePolicy

putResourcePolicy_enableHybrid :: Lens' PutResourcePolicy (Maybe EnableHybridValues) Source #

If 'TRUE', indicates that you are using both methods to grant cross-account access to Data Catalog resources:

  • By directly updating the resource policy with PutResourcePolicy
  • By using the Grant permissions command on the Amazon Web Services Management Console.

Must be set to 'TRUE' if you have already used the Management Console to grant cross-account access, otherwise the call fails. Default is 'FALSE'.

putResourcePolicy_policyExistsCondition :: Lens' PutResourcePolicy (Maybe ExistCondition) Source #

A value of MUST_EXIST is used to update a policy. A value of NOT_EXIST is used to create a new policy. If a value of NONE or a null value is used, the call does not depend on the existence of a policy.
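
One reading of the three conditions above, written as a pure function. `ExistCond` and `conditionHolds` are hypothetical names for this sketch and do not mirror the library's `ExistCondition` constructors. The model assumes the only input that matters is whether a policy already exists.

```haskell
-- HYPOTHETICAL stand-in for the MUST_EXIST / NOT_EXIST / NONE values.
data ExistCond = MustExist | NotExist | NoCond
  deriving (Eq, Show)

-- | Whether a call made with the given condition may proceed, given
-- whether a policy currently exists.
conditionHolds :: ExistCond -> Bool -> Bool
conditionHolds MustExist exists = exists -- update: a policy must be there
conditionHolds NotExist exists = not exists -- create: no policy may be there
conditionHolds NoCond _ = True -- no dependency on existence
```

Under this model, a null (absent) condition behaves like `NoCond`.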

putResourcePolicy_policyHashCondition :: Lens' PutResourcePolicy (Maybe Text) Source #

The hash value returned when the previous policy was set using PutResourcePolicy. Its purpose is to prevent concurrent modifications of a policy. Do not use this parameter if no previous policy has been set.

putResourcePolicy_resourceArn :: Lens' PutResourcePolicy (Maybe Text) Source #

Do not use. For internal use only.

putResourcePolicy_policyInJson :: Lens' PutResourcePolicy Text Source #

Contains the policy document to set, in JSON format.

putResourcePolicyResponse_policyHash :: Lens' PutResourcePolicyResponse (Maybe Text) Source #

A hash of the policy that has just been set. This must be included in a subsequent call that overwrites or updates this policy.

PutSchemaVersionMetadata

PutWorkflowRunProperties

putWorkflowRunProperties_runId :: Lens' PutWorkflowRunProperties Text Source #

The ID of the workflow run for which the run properties should be updated.

QuerySchemaVersionMetadata

querySchemaVersionMetadata_maxResults :: Lens' QuerySchemaVersionMetadata (Maybe Natural) Source #

Maximum number of results required per page. If no value is supplied, the default is 25 per page.

querySchemaVersionMetadata_metadataList :: Lens' QuerySchemaVersionMetadata (Maybe [MetadataKeyValuePair]) Source #

Search key-value pairs for metadata. If they are not provided, all the metadata information will be fetched.

querySchemaVersionMetadata_nextToken :: Lens' QuerySchemaVersionMetadata (Maybe Text) Source #

A continuation token, if this is a continuation call.

querySchemaVersionMetadata_schemaId :: Lens' QuerySchemaVersionMetadata (Maybe SchemaId) Source #

A wrapper structure that may contain the schema name and Amazon Resource Name (ARN).

querySchemaVersionMetadataResponse_nextToken :: Lens' QuerySchemaVersionMetadataResponse (Maybe Text) Source #

A continuation token for paginating the returned list of tokens, returned if the current segment of the list is not the last.

RegisterSchemaVersion

registerSchemaVersion_schemaId :: Lens' RegisterSchemaVersion SchemaId Source #

This is a wrapper structure to contain schema identity fields. The structure contains:

  • SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. Either SchemaArn or SchemaName and RegistryName has to be provided.
  • SchemaId$SchemaName: The name of the schema. Either SchemaArn or SchemaName and RegistryName has to be provided.

registerSchemaVersion_schemaDefinition :: Lens' RegisterSchemaVersion Text Source #

The schema definition using the DataFormat setting for the SchemaName.

registerSchemaVersionResponse_schemaVersionId :: Lens' RegisterSchemaVersionResponse (Maybe Text) Source #

The unique ID that represents the version of this schema.

registerSchemaVersionResponse_versionNumber :: Lens' RegisterSchemaVersionResponse (Maybe Natural) Source #

The version of this schema (for sync flow only, in case this is the first version).

RemoveSchemaVersionMetadata

removeSchemaVersionMetadata_schemaId :: Lens' RemoveSchemaVersionMetadata (Maybe SchemaId) Source #

A wrapper structure that may contain the schema name and Amazon Resource Name (ARN).

ResetJobBookmark

resetJobBookmark_runId :: Lens' ResetJobBookmark (Maybe Text) Source #

The unique run identifier associated with this job run.

resetJobBookmark_jobName :: Lens' ResetJobBookmark Text Source #

The name of the job in question.

ResumeWorkflowRun

resumeWorkflowRun_name :: Lens' ResumeWorkflowRun Text Source #

The name of the workflow to resume.

resumeWorkflowRun_runId :: Lens' ResumeWorkflowRun Text Source #

The ID of the workflow run to resume.

resumeWorkflowRun_nodeIds :: Lens' ResumeWorkflowRun [Text] Source #

A list of the node IDs for the nodes you want to restart. The nodes that are to be restarted must have a run attempt in the original run.

resumeWorkflowRunResponse_nodeIds :: Lens' ResumeWorkflowRunResponse (Maybe [Text]) Source #

A list of the node IDs for the nodes that were actually restarted.

resumeWorkflowRunResponse_runId :: Lens' ResumeWorkflowRunResponse (Maybe Text) Source #

The new ID assigned to the resumed workflow run. Each resume of a workflow run will have a new run ID.

RunStatement

runStatement_sessionId :: Lens' RunStatement Text Source #

The Session Id of the statement to be run.

runStatement_code :: Lens' RunStatement Text Source #

The statement code to be run.

runStatementResponse_id :: Lens' RunStatementResponse (Maybe Int) Source #

Returns the Id of the statement that was run.

SearchTables

searchTables_catalogId :: Lens' SearchTables (Maybe Text) Source #

A unique identifier, consisting of account_id.

searchTables_filters :: Lens' SearchTables (Maybe [PropertyPredicate]) Source #

A list of key-value pairs, and a comparator used to filter the search results. Returns all entities matching the predicate.

The Comparator member of the PropertyPredicate struct is used only for time fields, and can be omitted for other field types. Also, when comparing string values, such as when Key=Name, a fuzzy match algorithm is used. The Key field (for example, the value of the Name field) is split on certain punctuation characters, for example, -, :, #, etc. into tokens. Then each token is exact-match compared with the Value member of PropertyPredicate. For example, if Key=Name and Value=link, tables named customer-link and xx-link-yy are returned, but xxlinkyy is not returned.
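
The token matching described above can be approximated locally. This is a rough sketch, not the service's algorithm: the exact delimiter set is not specified beyond "certain punctuation characters", so the code assumes every non-alphanumeric character splits tokens. `tokenize` and `matchesValue` are hypothetical helpers.

```haskell
import Data.Char (isAlphaNum)

-- | Split a name into tokens.
-- ASSUMPTION: any non-alphanumeric character acts as a delimiter; the
-- real service may use a narrower set.
tokenize :: String -> [String]
tokenize s = case dropWhile (not . isAlphaNum) s of
  "" -> []
  s' ->
    let (tok, rest) = span isAlphaNum s'
     in tok : tokenize rest

-- | True when some token of the name equals the search value exactly.
matchesValue :: String -> String -> Bool
matchesValue value name = value `elem` tokenize name
```

With `Value=link`, this accepts `customer-link` and `xx-link-yy` and rejects `xxlinkyy`, matching the example in the description.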

searchTables_maxResults :: Lens' SearchTables (Maybe Natural) Source #

The maximum number of tables to return in a single response.

searchTables_nextToken :: Lens' SearchTables (Maybe Text) Source #

A continuation token, included if this is a continuation call.

searchTables_resourceShareType :: Lens' SearchTables (Maybe ResourceShareType) Source #

Allows you to specify that you want to search the tables shared with your account. The allowable values are FOREIGN or ALL.

  • If set to FOREIGN, searches the tables shared with your account.
  • If set to ALL, searches the tables shared with your account, as well as the tables in your local account.

searchTables_searchText :: Lens' SearchTables (Maybe Text) Source #

A string used for a text search.

Specifying a value in quotes filters based on an exact match to the value.

searchTables_sortCriteria :: Lens' SearchTables (Maybe [SortCriterion]) Source #

A list of criteria for sorting the results by a field name, in an ascending or descending order.

searchTablesResponse_nextToken :: Lens' SearchTablesResponse (Maybe Text) Source #

A continuation token, present if the current list segment is not the last.

searchTablesResponse_tableList :: Lens' SearchTablesResponse (Maybe [Table]) Source #

A list of the requested Table objects. The SearchTables response returns only the tables that you have access to.
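The request and response nextToken fields imply a loop: send the request without a token, then resend it with each returned token until a response carries none. A minimal sketch of that loop follows, written against a hypothetical Page type and a caller-supplied fetch function rather than the real request and response types.

```haskell
-- Hypothetical page type: some items plus an optional continuation token.
data Page a = Page
  { pageItems :: [a]
  , pageNextToken :: Maybe String
  }

-- Collect every item by re-issuing the request with the returned token
-- until a page arrives without one.
collectAll :: Monad m => (Maybe String -> m (Page a)) -> m [a]
collectAll fetch = go Nothing
  where
    go token = do
      page <- fetch token
      case pageNextToken page of
        Nothing -> pure (pageItems page)
        next    -> (pageItems page ++) <$> go next
```

The fetch argument stands for whatever sends one request. The sketch keeps every page in memory, so a streaming approach may suit large result sets better.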

StartBlueprintRun

startBlueprintRun_parameters :: Lens' StartBlueprintRun (Maybe Text) Source #

Specifies the parameters as a BlueprintParameters object.

startBlueprintRun_roleArn :: Lens' StartBlueprintRun Text Source #

Specifies the IAM role used to create the workflow.

StartCrawler

startCrawler_name :: Lens' StartCrawler Text Source #

Name of the crawler to start.

StartCrawlerSchedule

StartDataQualityRuleRecommendationRun

startDataQualityRuleRecommendationRun_clientToken :: Lens' StartDataQualityRuleRecommendationRun (Maybe Text) Source #

Used for idempotency and is recommended to be set to a random ID (such as a UUID) to avoid creating or starting multiple instances of the same resource.

startDataQualityRuleRecommendationRun_numberOfWorkers :: Lens' StartDataQualityRuleRecommendationRun (Maybe Int) Source #

The number of G.1X workers to be used in the run. The default is 5.

startDataQualityRuleRecommendationRun_timeout :: Lens' StartDataQualityRuleRecommendationRun (Maybe Natural) Source #

The timeout for a run in minutes. This is the maximum time that a run can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).

StartDataQualityRulesetEvaluationRun

startDataQualityRulesetEvaluationRun_clientToken :: Lens' StartDataQualityRulesetEvaluationRun (Maybe Text) Source #

Used for idempotency and is recommended to be set to a random ID (such as a UUID) to avoid creating or starting multiple instances of the same resource.

startDataQualityRulesetEvaluationRun_numberOfWorkers :: Lens' StartDataQualityRulesetEvaluationRun (Maybe Int) Source #

The number of G.1X workers to be used in the run. The default is 5.

startDataQualityRulesetEvaluationRun_timeout :: Lens' StartDataQualityRulesetEvaluationRun (Maybe Natural) Source #

The timeout for a run in minutes. This is the maximum time that a run can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).

startDataQualityRulesetEvaluationRun_role :: Lens' StartDataQualityRulesetEvaluationRun Text Source #

An IAM role supplied to encrypt the results of the run.

StartExportLabelsTaskRun

startExportLabelsTaskRun_transformId :: Lens' StartExportLabelsTaskRun Text Source #

The unique identifier of the machine learning transform.

startExportLabelsTaskRun_outputS3Path :: Lens' StartExportLabelsTaskRun Text Source #

The Amazon S3 path where you export the labels.

StartImportLabelsTaskRun

startImportLabelsTaskRun_replaceAllLabels :: Lens' StartImportLabelsTaskRun (Maybe Bool) Source #

Indicates whether to overwrite your existing labels.

startImportLabelsTaskRun_transformId :: Lens' StartImportLabelsTaskRun Text Source #

The unique identifier of the machine learning transform.

startImportLabelsTaskRun_inputS3Path :: Lens' StartImportLabelsTaskRun Text Source #

The Amazon Simple Storage Service (Amazon S3) path from where you import the labels.

StartJobRun

startJobRun_allocatedCapacity :: Lens' StartJobRun (Maybe Int) Source #

This field is deprecated. Use MaxCapacity instead.

The number of Glue data processing units (DPUs) to allocate to this JobRun. You can allocate a minimum of 2 DPUs; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page.

startJobRun_arguments :: Lens' StartJobRun (Maybe (HashMap Text Text)) Source #

The job arguments specifically for this run. For this job run, they replace the default arguments set in the job definition itself.

You can specify arguments here that your own job-execution script consumes, as well as arguments that Glue itself consumes.

Job arguments may be logged. Do not pass plaintext secrets as arguments. Retrieve secrets from a Glue Connection, Secrets Manager or other secret management mechanism if you intend to keep them within the Job.

For information about how to specify and consume your own Job arguments, see the Calling Glue APIs in Python topic in the developer guide.

For information about the key-value pairs that Glue consumes to set up your job, see the Special Parameters Used by Glue topic in the developer guide.

startJobRun_executionClass :: Lens' StartJobRun (Maybe ExecutionClass) Source #

Indicates whether the job is run with a standard or flexible execution class. The standard execution class is ideal for time-sensitive workloads that require fast job startup and dedicated resources.

The flexible execution class is appropriate for time-insensitive jobs whose start and completion times may vary.

Only jobs with Glue version 3.0 and above and command type glueetl will be allowed to set ExecutionClass to FLEX. The flexible execution class is available for Spark jobs.
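The FLEX restriction above amounts to a two-part check. The sketch below encodes it with a hypothetical GlueVersion pair and a plain command-name string; neither is a library type, and the service remains the authority on what it accepts.

```haskell
-- Glue version as (major, minor); a hypothetical representation chosen
-- because tuples compare lexicographically.
type GlueVersion = (Int, Int)

-- True when both stated conditions hold: version 3.0 or later and the
-- glueetl command type.
flexAllowed :: GlueVersion -> String -> Bool
flexAllowed version commandName =
  version >= (3, 0) && commandName == "glueetl"
```

For example, flexAllowed (3, 0) "glueetl" is True, while flexAllowed (2, 0) "glueetl" and flexAllowed (4, 0) "pythonshell" are both False.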

startJobRun_jobRunId :: Lens' StartJobRun (Maybe Text) Source #

The ID of a previous JobRun to retry.

startJobRun_maxCapacity :: Lens' StartJobRun (Maybe Double) Source #

The number of Glue data processing units (DPUs) that can be allocated when this job runs. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page.

Do not set MaxCapacity if using WorkerType and NumberOfWorkers.

The value that can be allocated for MaxCapacity depends on whether you are running a Python shell job, or an Apache Spark ETL job:

  • When you specify a Python shell job (JobCommand.Name="pythonshell"), you can allocate either 0.0625 or 1 DPU. The default is 0.0625 DPU.
  • When you specify an Apache Spark ETL job (JobCommand.Name="glueetl"), you can allocate a minimum of 2 DPUs. The default is 10 DPUs. This job type cannot have a fractional DPU allocation.
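The two MaxCapacity rules above can be checked on the client before a request is sent. The sketch below is one reading of those rules; JobCommand, defaultMaxCapacity and validMaxCapacity are hypothetical names, and no upper bound is enforced because the text does not state one.

```haskell
-- Hypothetical tag for the two job command types named in the text.
data JobCommand = PythonShell | GlueEtl
  deriving (Eq, Show)

-- Default MaxCapacity per command type, as stated in the text.
defaultMaxCapacity :: JobCommand -> Double
defaultMaxCapacity PythonShell = 0.0625
defaultMaxCapacity GlueEtl     = 10

-- Check a requested MaxCapacity against the stated rules:
-- Python shell jobs take exactly 0.0625 or 1 DPU; Spark ETL jobs take a
-- whole number of DPUs, at least 2.
validMaxCapacity :: JobCommand -> Double -> Bool
validMaxCapacity PythonShell dpu = dpu == 0.0625 || dpu == 1
validMaxCapacity GlueEtl dpu =
  dpu >= 2 && dpu == fromIntegral (round dpu :: Integer)
```

Under this reading, validMaxCapacity GlueEtl 2.5 is False because of the fractional allocation, and validMaxCapacity PythonShell 0.5 is False because only the two listed values are accepted.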

startJobRun_notificationProperty :: Lens' StartJobRun (Maybe NotificationProperty) Source #

Specifies configuration properties of a job run notification.

startJobRun_numberOfWorkers :: Lens' StartJobRun (Maybe Int) Source #

The number of workers of a defined workerType that are allocated when a job runs.

startJobRun_securityConfiguration :: Lens' StartJobRun (Maybe Text) Source #

The name of the SecurityConfiguration structure to be used with this job run.

startJobRun_timeout :: Lens' StartJobRun (Maybe Natural) Source #

The JobRun timeout in minutes. This is the maximum time that a job run can consume resources before it is terminated and enters TIMEOUT status. This value overrides the timeout value set in the parent job.

Streaming jobs do not have a timeout. The default for non-streaming jobs is 2,880 minutes (48 hours).

startJobRun_workerType :: Lens' StartJobRun (Maybe WorkerType) Source #

The type of predefined worker that is allocated when a job runs. Accepts a value of Standard, G.1X, G.2X, or G.025X.

  • For the Standard worker type, each worker provides 4 vCPU, 16 GB of memory and a 50GB disk, and 2 executors per worker.
  • For the G.1X worker type, each worker provides 4 vCPU, 16 GB of memory and a 64GB disk, and 1 executor per worker.
  • For the G.2X worker type, each worker provides 8 vCPU, 32 GB of memory and a 128GB disk, and 1 executor per worker.
  • For the G.025X worker type, each worker maps to 0.25 DPU (2 vCPU, 4 GB of memory, 64 GB disk), and provides 1 executor per worker. We recommend this worker type for low volume streaming jobs. This worker type is only available for Glue version 3.0 streaming jobs.
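For capacity planning, the per-worker figures in the list above can be kept in a small lookup and multiplied by the worker count. The sketch below does only that arithmetic; WorkerKind, workerSpec and totalResources are hypothetical names, and the totals ignore any overhead the service may reserve.

```haskell
-- Hypothetical tag for the four worker types listed in the text.
data WorkerKind = Standard | G1X | G2X | G025X
  deriving (Eq, Show)

-- (vCPUs, memory in GB, disk in GB, executors) per worker, from the list above.
workerSpec :: WorkerKind -> (Int, Int, Int, Int)
workerSpec Standard = (4, 16, 50, 2)
workerSpec G1X      = (4, 16, 64, 1)
workerSpec G2X      = (8, 32, 128, 1)
workerSpec G025X    = (2, 4, 64, 1)

-- Total vCPUs and memory (GB) for a given number of workers.
totalResources :: WorkerKind -> Int -> (Int, Int)
totalResources kind n =
  let (vcpu, mem, _, _) = workerSpec kind
  in (vcpu * n, mem * n)
```

For instance, totalResources G2X 10 gives (80, 320): 80 vCPUs and 320 GB of memory.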

startJobRun_jobName :: Lens' StartJobRun Text Source #

The name of the job definition to use.

StartMLEvaluationTaskRun

startMLEvaluationTaskRun_transformId :: Lens' StartMLEvaluationTaskRun Text Source #

The unique identifier of the machine learning transform.

StartMLLabelingSetGenerationTaskRun

startMLLabelingSetGenerationTaskRun_outputS3Path :: Lens' StartMLLabelingSetGenerationTaskRun Text Source #

The Amazon Simple Storage Service (Amazon S3) path where you generate the labeling set.

StartTrigger

startTrigger_name :: Lens' StartTrigger Text Source #

The name of the trigger to start.

startTriggerResponse_name :: Lens' StartTriggerResponse (Maybe Text) Source #

The name of the trigger that was started.

StartWorkflowRun

startWorkflowRun_runProperties :: Lens' StartWorkflowRun (Maybe (HashMap Text Text)) Source #

The workflow run properties for the new workflow run.

startWorkflowRun_name :: Lens' StartWorkflowRun Text Source #

The name of the workflow to start.

StopCrawler

stopCrawler_name :: Lens' StopCrawler Text Source #

Name of the crawler to stop.

StopCrawlerSchedule

stopCrawlerSchedule_crawlerName :: Lens' StopCrawlerSchedule Text Source #

Name of the crawler whose schedule state to set.

StopSession

stopSession_id :: Lens' StopSession Text Source #

The ID of the session to be stopped.

stopSessionResponse_id :: Lens' StopSessionResponse (Maybe Text) Source #

Returns the Id of the stopped session.

StopTrigger

stopTrigger_name :: Lens' StopTrigger Text Source #

The name of the trigger to stop.

stopTriggerResponse_name :: Lens' StopTriggerResponse (Maybe Text) Source #

The name of the trigger that was stopped.

StopWorkflowRun

stopWorkflowRun_name :: Lens' StopWorkflowRun Text Source #

The name of the workflow to stop.

stopWorkflowRun_runId :: Lens' StopWorkflowRun Text Source #

The ID of the workflow run to stop.

TagResource

tagResource_resourceArn :: Lens' TagResource Text Source #

The ARN of the Glue resource to which to add the tags. For more information about Glue resource ARNs, see the Glue ARN string pattern.

tagResource_tagsToAdd :: Lens' TagResource (HashMap Text Text) Source #

Tags to add to this resource.

UntagResource

untagResource_resourceArn :: Lens' UntagResource Text Source #

The Amazon Resource Name (ARN) of the resource from which to remove the tags.

untagResource_tagsToRemove :: Lens' UntagResource [Text] Source #

Tags to remove from this resource.

UpdateBlueprint

updateBlueprint_description :: Lens' UpdateBlueprint (Maybe Text) Source #

A description of the blueprint.

updateBlueprint_name :: Lens' UpdateBlueprint Text Source #

The name of the blueprint.

updateBlueprint_blueprintLocation :: Lens' UpdateBlueprint Text Source #

Specifies a path in Amazon S3 where the blueprint is published.

updateBlueprintResponse_name :: Lens' UpdateBlueprintResponse (Maybe Text) Source #

Returns the name of the blueprint that was updated.

UpdateClassifier

UpdateColumnStatisticsForPartition

updateColumnStatisticsForPartition_catalogId :: Lens' UpdateColumnStatisticsForPartition (Maybe Text) Source #

The ID of the Data Catalog where the partitions in question reside. If none is supplied, the Amazon Web Services account ID is used by default.

updateColumnStatisticsForPartition_databaseName :: Lens' UpdateColumnStatisticsForPartition Text Source #

The name of the catalog database where the partitions reside.

UpdateColumnStatisticsForTable

updateColumnStatisticsForTable_catalogId :: Lens' UpdateColumnStatisticsForTable (Maybe Text) Source #

The ID of the Data Catalog where the partitions in question reside. If none is supplied, the Amazon Web Services account ID is used by default.

updateColumnStatisticsForTable_databaseName :: Lens' UpdateColumnStatisticsForTable Text Source #

The name of the catalog database where the partitions reside.

UpdateConnection

updateConnection_catalogId :: Lens' UpdateConnection (Maybe Text) Source #

The ID of the Data Catalog in which the connection resides. If none is provided, the Amazon Web Services account ID is used by default.

updateConnection_name :: Lens' UpdateConnection Text Source #

The name of the connection definition to update.

updateConnection_connectionInput :: Lens' UpdateConnection ConnectionInput Source #

A ConnectionInput object that redefines the connection in question.

UpdateCrawler

updateCrawler_classifiers :: Lens' UpdateCrawler (Maybe [Text]) Source #

A list of custom classifiers that the user has registered. By default, all built-in classifiers are included in a crawl, but these custom classifiers always override the default classifiers for a given classification.

updateCrawler_configuration :: Lens' UpdateCrawler (Maybe Text) Source #

Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler's behavior. For more information, see Setting crawler configuration options.

updateCrawler_crawlerSecurityConfiguration :: Lens' UpdateCrawler (Maybe Text) Source #

The name of the SecurityConfiguration structure to be used by this crawler.

updateCrawler_databaseName :: Lens' UpdateCrawler (Maybe Text) Source #

The Glue database where results are stored, such as: arn:aws:daylight:us-east-1::database/sometable/*.

updateCrawler_description :: Lens' UpdateCrawler (Maybe Text) Source #

A description of the new crawler.

updateCrawler_lakeFormationConfiguration :: Lens' UpdateCrawler (Maybe LakeFormationConfiguration) Source #

Specifies Lake Formation configuration settings for the crawler.

updateCrawler_lineageConfiguration :: Lens' UpdateCrawler (Maybe LineageConfiguration) Source #

Specifies data lineage configuration settings for the crawler.

updateCrawler_recrawlPolicy :: Lens' UpdateCrawler (Maybe RecrawlPolicy) Source #

A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.

updateCrawler_role :: Lens' UpdateCrawler (Maybe Text) Source #

The IAM role or Amazon Resource Name (ARN) of an IAM role that is used by the new crawler to access customer resources.

updateCrawler_schedule :: Lens' UpdateCrawler (Maybe Text) Source #

A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).
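A schedule string of the form shown above can be assembled from an hour and a minute. The helper below is a hypothetical convenience, not part of the library; it covers only the "every day at a fixed UTC time" case and rejects out-of-range fields.

```haskell
-- Build a daily schedule expression from an hour and minute in UTC.
-- Returns Nothing when either field is out of range.
dailyCron :: Int -> Int -> Maybe String
dailyCron hour minute
  | hour < 0 || hour > 23     = Nothing
  | minute < 0 || minute > 59 = Nothing
  | otherwise =
      Just ("cron(" ++ show minute ++ " " ++ show hour ++ " * * ? *)")
```

Here dailyCron 12 15 yields Just "cron(15 12 * * ? *)", the same expression as the example in the text.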

updateCrawler_schemaChangePolicy :: Lens' UpdateCrawler (Maybe SchemaChangePolicy) Source #

The policy for the crawler's update and deletion behavior.

updateCrawler_tablePrefix :: Lens' UpdateCrawler (Maybe Text) Source #

The table prefix used for catalog tables that are created.

updateCrawler_name :: Lens' UpdateCrawler Text Source #

Name of the new crawler.

UpdateCrawlerSchedule

updateCrawlerSchedule_schedule :: Lens' UpdateCrawlerSchedule (Maybe Text) Source #

The updated cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).

updateCrawlerSchedule_crawlerName :: Lens' UpdateCrawlerSchedule Text Source #

The name of the crawler whose schedule to update.

UpdateDataQualityRuleset

updateDataQualityRuleset_ruleset :: Lens' UpdateDataQualityRuleset (Maybe Text) Source #

A Data Quality Definition Language (DQDL) ruleset. For more information, see the Glue developer guide.

updateDataQualityRuleset_updatedName :: Lens' UpdateDataQualityRuleset (Maybe Text) Source #

The new name of the ruleset, if you are renaming it.

updateDataQualityRulesetResponse_ruleset :: Lens' UpdateDataQualityRulesetResponse (Maybe Text) Source #

A Data Quality Definition Language (DQDL) ruleset. For more information, see the Glue developer guide.

UpdateDatabase

updateDatabase_catalogId :: Lens' UpdateDatabase (Maybe Text) Source #

The ID of the Data Catalog in which the metadata database resides. If none is provided, the Amazon Web Services account ID is used by default.

updateDatabase_name :: Lens' UpdateDatabase Text Source #

The name of the database to update in the catalog. For Hive compatibility, this is folded to lowercase.

updateDatabase_databaseInput :: Lens' UpdateDatabase DatabaseInput Source #

A DatabaseInput object specifying the new definition of the metadata database in the catalog.

UpdateDevEndpoint

updateDevEndpoint_addArguments :: Lens' UpdateDevEndpoint (Maybe (HashMap Text Text)) Source #

The map of arguments to add to the map of arguments used to configure the DevEndpoint.

Valid arguments are:

  • "--enable-glue-datacatalog": ""

You can specify a version of Python support for development endpoints by using the Arguments parameter in the CreateDevEndpoint or UpdateDevEndpoint APIs. If no arguments are provided, the version defaults to Python 2.

updateDevEndpoint_addPublicKeys :: Lens' UpdateDevEndpoint (Maybe [Text]) Source #

The list of public keys for the DevEndpoint to use.

updateDevEndpoint_customLibraries :: Lens' UpdateDevEndpoint (Maybe DevEndpointCustomLibraries) Source #

Custom Python or Java libraries to be loaded in the DevEndpoint.

updateDevEndpoint_deleteArguments :: Lens' UpdateDevEndpoint (Maybe [Text]) Source #

The list of argument keys to be deleted from the map of arguments used to configure the DevEndpoint.

updateDevEndpoint_deletePublicKeys :: Lens' UpdateDevEndpoint (Maybe [Text]) Source #

The list of public keys to be deleted from the DevEndpoint.

updateDevEndpoint_publicKey :: Lens' UpdateDevEndpoint (Maybe Text) Source #

The public key for the DevEndpoint to use.

updateDevEndpoint_updateEtlLibraries :: Lens' UpdateDevEndpoint (Maybe Bool) Source #

True if the list of custom libraries to be loaded in the development endpoint needs to be updated, or False otherwise.

updateDevEndpoint_endpointName :: Lens' UpdateDevEndpoint Text Source #

The name of the DevEndpoint to be updated.

UpdateJob

updateJob_jobName :: Lens' UpdateJob Text Source #

The name of the job definition to update.

updateJob_jobUpdate :: Lens' UpdateJob JobUpdate Source #

Specifies the values with which to update the job definition. Unspecified configuration is removed or reset to default values.

updateJobResponse_jobName :: Lens' UpdateJobResponse (Maybe Text) Source #

Returns the name of the updated job definition.

updateJobResponse_httpStatus :: Lens' UpdateJobResponse Int Source #

The response's HTTP status code.

UpdateJobFromSourceControl

updateJobFromSourceControl_authStrategy :: Lens' UpdateJobFromSourceControl (Maybe SourceControlAuthStrategy) Source #

The type of authentication, which can be an authentication token stored in Amazon Web Services Secrets Manager, or a personal access token.

updateJobFromSourceControl_commitId :: Lens' UpdateJobFromSourceControl (Maybe Text) Source #

A commit ID for a commit in the remote repository.

updateJobFromSourceControl_folder :: Lens' UpdateJobFromSourceControl (Maybe Text) Source #

An optional folder in the remote repository.

updateJobFromSourceControl_jobName :: Lens' UpdateJobFromSourceControl (Maybe Text) Source #

The name of the Glue job to be synchronized to or from the remote repository.

updateJobFromSourceControl_repositoryName :: Lens' UpdateJobFromSourceControl (Maybe Text) Source #

The name of the remote repository that contains the job artifacts.

updateJobFromSourceControl_repositoryOwner :: Lens' UpdateJobFromSourceControl (Maybe Text) Source #

The owner of the remote repository that contains the job artifacts.

UpdateMLTransform

updateMLTransform_description :: Lens' UpdateMLTransform (Maybe Text) Source #

A description of the transform. The default is an empty string.

updateMLTransform_glueVersion :: Lens' UpdateMLTransform (Maybe Text) Source #

This value determines which version of Glue this machine learning transform is compatible with. Glue 1.0 is recommended for most customers. If the value is not set, the Glue compatibility defaults to Glue 0.9. For more information, see Glue Versions in the developer guide.

updateMLTransform_maxCapacity :: Lens' UpdateMLTransform (Maybe Double) Source #

The number of Glue data processing units (DPUs) that are allocated to task runs for this transform. You can allocate from 2 to 100 DPUs; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page.

When the WorkerType field is set to a value other than Standard, the MaxCapacity field is set automatically and becomes read-only.

updateMLTransform_maxRetries :: Lens' UpdateMLTransform (Maybe Int) Source #

The maximum number of times to retry a task for this transform after a task run fails.

updateMLTransform_name :: Lens' UpdateMLTransform (Maybe Text) Source #

The unique name that you gave the transform when you created it.

updateMLTransform_numberOfWorkers :: Lens' UpdateMLTransform (Maybe Int) Source #

The number of workers of a defined workerType that are allocated when this task runs.

updateMLTransform_parameters :: Lens' UpdateMLTransform (Maybe TransformParameters) Source #

The configuration parameters that are specific to the transform type (algorithm) used. Conditionally dependent on the transform type.

updateMLTransform_role :: Lens' UpdateMLTransform (Maybe Text) Source #

The name or Amazon Resource Name (ARN) of the IAM role with the required permissions.

updateMLTransform_timeout :: Lens' UpdateMLTransform (Maybe Natural) Source #

The timeout for a task run for this transform in minutes. This is the maximum time that a task run for this transform can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).

updateMLTransform_workerType :: Lens' UpdateMLTransform (Maybe WorkerType) Source #

The type of predefined worker that is allocated when this task runs. Accepts a value of Standard, G.1X, or G.2X.

  • For the Standard worker type, each worker provides 4 vCPU, 16 GB of memory and a 50GB disk, and 2 executors per worker.
  • For the G.1X worker type, each worker provides 4 vCPU, 16 GB of memory and a 64GB disk, and 1 executor per worker.
  • For the G.2X worker type, each worker provides 8 vCPU, 32 GB of memory and a 128GB disk, and 1 executor per worker.

updateMLTransform_transformId :: Lens' UpdateMLTransform Text Source #

A unique identifier that was generated when the transform was created.

updateMLTransformResponse_transformId :: Lens' UpdateMLTransformResponse (Maybe Text) Source #

The unique identifier for the transform that was updated.

UpdatePartition

updatePartition_catalogId :: Lens' UpdatePartition (Maybe Text) Source #

The ID of the Data Catalog where the partition to be updated resides. If none is provided, the Amazon Web Services account ID is used by default.

updatePartition_databaseName :: Lens' UpdatePartition Text Source #

The name of the catalog database in which the table in question resides.

updatePartition_tableName :: Lens' UpdatePartition Text Source #

The name of the table in which the partition to be updated is located.

updatePartition_partitionValueList :: Lens' UpdatePartition [Text] Source #

List of partition key values that define the partition to update.

updatePartition_partitionInput :: Lens' UpdatePartition PartitionInput Source #

The new partition object to update the partition to.

The Values property can't be changed. If you want to change the partition key values for a partition, delete and recreate the partition.

UpdateRegistry

updateRegistry_registryId :: Lens' UpdateRegistry RegistryId Source #

This is a wrapper structure that may contain the registry name and Amazon Resource Name (ARN).

updateRegistry_description :: Lens' UpdateRegistry Text Source #

A description of the registry. If description is not provided, this field will not be updated.

updateRegistryResponse_registryArn :: Lens' UpdateRegistryResponse (Maybe Text) Source #

The Amazon Resource Name (ARN) of the updated registry.

UpdateSchema

updateSchema_compatibility :: Lens' UpdateSchema (Maybe Compatibility) Source #

The new compatibility setting for the schema.

updateSchema_description :: Lens' UpdateSchema (Maybe Text) Source #

The new description for the schema.

updateSchema_schemaVersionNumber :: Lens' UpdateSchema (Maybe SchemaVersionNumber) Source #

Version number required for checkpointing. One of VersionNumber or Compatibility has to be provided.

updateSchema_schemaId :: Lens' UpdateSchema SchemaId Source #

This is a wrapper structure to contain schema identity fields. The structure contains:

  • SchemaId$SchemaArn: The Amazon Resource Name (ARN) of the schema. One of SchemaArn or SchemaName has to be provided.
  • SchemaId$SchemaName: The name of the schema. One of SchemaArn or SchemaName has to be provided.

updateSchemaResponse_registryName :: Lens' UpdateSchemaResponse (Maybe Text) Source #

The name of the registry that contains the schema.

updateSchemaResponse_schemaArn :: Lens' UpdateSchemaResponse (Maybe Text) Source #

The Amazon Resource Name (ARN) of the schema.

UpdateSourceControlFromJob

updateSourceControlFromJob_authStrategy :: Lens' UpdateSourceControlFromJob (Maybe SourceControlAuthStrategy) Source #

The type of authentication, which can be an authentication token stored in Amazon Web Services Secrets Manager, or a personal access token.

updateSourceControlFromJob_commitId :: Lens' UpdateSourceControlFromJob (Maybe Text) Source #

A commit ID for a commit in the remote repository.

updateSourceControlFromJob_folder :: Lens' UpdateSourceControlFromJob (Maybe Text) Source #

An optional folder in the remote repository.

updateSourceControlFromJob_jobName :: Lens' UpdateSourceControlFromJob (Maybe Text) Source #

The name of the Glue job to be synchronized to or from the remote repository.

updateSourceControlFromJob_repositoryName :: Lens' UpdateSourceControlFromJob (Maybe Text) Source #

The name of the remote repository that contains the job artifacts.

updateSourceControlFromJob_repositoryOwner :: Lens' UpdateSourceControlFromJob (Maybe Text) Source #

The owner of the remote repository that contains the job artifacts.

UpdateTable

updateTable_catalogId :: Lens' UpdateTable (Maybe Text) Source #

The ID of the Data Catalog where the table resides. If none is provided, the Amazon Web Services account ID is used by default.

updateTable_skipArchive :: Lens' UpdateTable (Maybe Bool) Source #

By default, UpdateTable always creates an archived version of the table before updating it. However, if skipArchive is set to true, UpdateTable does not create the archived version.

updateTable_transactionId :: Lens' UpdateTable (Maybe Text) Source #

The transaction ID at which to update the table contents.

updateTable_versionId :: Lens' UpdateTable (Maybe Text) Source #

The version ID at which to update the table contents.

updateTable_databaseName :: Lens' UpdateTable Text Source #

The name of the catalog database in which the table resides. For Hive compatibility, this name is entirely lowercase.

updateTable_tableInput :: Lens' UpdateTable TableInput Source #

An updated TableInput object to define the metadata table in the catalog.

UpdateTrigger

updateTrigger_name :: Lens' UpdateTrigger Text Source #

The name of the trigger to update.

updateTrigger_triggerUpdate :: Lens' UpdateTrigger TriggerUpdate Source #

The new values with which to update the trigger.

UpdateUserDefinedFunction

updateUserDefinedFunction_catalogId :: Lens' UpdateUserDefinedFunction (Maybe Text) Source #

The ID of the Data Catalog where the function to be updated is located. If none is provided, the Amazon Web Services account ID is used by default.

updateUserDefinedFunction_databaseName :: Lens' UpdateUserDefinedFunction Text Source #

The name of the catalog database where the function to be updated is located.

updateUserDefinedFunction_functionInput :: Lens' UpdateUserDefinedFunction UserDefinedFunctionInput Source #

A FunctionInput object that redefines the function in the Data Catalog.

UpdateWorkflow

updateWorkflow_defaultRunProperties :: Lens' UpdateWorkflow (Maybe (HashMap Text Text)) Source #

A collection of properties to be used as part of each execution of the workflow.

updateWorkflow_description :: Lens' UpdateWorkflow (Maybe Text) Source #

The description of the workflow.

updateWorkflow_maxConcurrentRuns :: Lens' UpdateWorkflow (Maybe Int) Source #

You can use this parameter to prevent unwanted multiple updates to data, to control costs, or in some cases, to prevent exceeding the maximum number of concurrent runs of any of the component jobs. If you leave this parameter blank, there is no limit to the number of concurrent workflow runs.

updateWorkflow_name :: Lens' UpdateWorkflow Text Source #

Name of the workflow to be updated.

updateWorkflowResponse_name :: Lens' UpdateWorkflowResponse (Maybe Text) Source #

The name of the workflow which was specified in input.

Types

Action

action_arguments :: Lens' Action (Maybe (HashMap Text Text)) Source #

The job arguments used when this trigger fires. For this job run, they replace the default arguments set in the job definition itself.

You can specify arguments here that your own job-execution script consumes, as well as arguments that Glue itself consumes.

For information about how to specify and consume your own Job arguments, see the Calling Glue APIs in Python topic in the developer guide.

For information about the key-value pairs that Glue consumes to set up your job, see the Special Parameters Used by Glue topic in the developer guide.

action_crawlerName :: Lens' Action (Maybe Text) Source #

The name of the crawler to be used with this action.

action_jobName :: Lens' Action (Maybe Text) Source #

The name of a job to be run.

action_notificationProperty :: Lens' Action (Maybe NotificationProperty) Source #

Specifies configuration properties of a job run notification.

action_securityConfiguration :: Lens' Action (Maybe Text) Source #

The name of the SecurityConfiguration structure to be used with this action.

action_timeout :: Lens' Action (Maybe Natural) Source #

The JobRun timeout in minutes. This is the maximum time that a job run can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours). This overrides the timeout value set in the parent job.

Aggregate

aggregate_name :: Lens' Aggregate Text Source #

The name of the transform node.

aggregate_inputs :: Lens' Aggregate (NonEmpty Text) Source #

Specifies the fields and rows to use as inputs for the aggregate transform.

aggregate_groups :: Lens' Aggregate [[Text]] Source #

Specifies the fields to group by.

aggregate_aggs :: Lens' Aggregate (NonEmpty AggregateOperation) Source #

Specifies the aggregate functions to be performed on specified fields.

AggregateOperation

aggregateOperation_column :: Lens' AggregateOperation [Text] Source #

Specifies the column on the data set on which the aggregation function will be applied.

aggregateOperation_aggFunc :: Lens' AggregateOperation AggFunction Source #

Specifies the aggregation function to apply.

Possible aggregation functions include: avg, countDistinct, count, first, last, kurtosis, max, min, skewness, stddev_samp, stddev_pop, sum, sumDistinct, var_samp, var_pop.

ApplyMapping

applyMapping_name :: Lens' ApplyMapping Text Source #

The name of the transform node.

applyMapping_inputs :: Lens' ApplyMapping (NonEmpty Text) Source #

The data inputs identified by their node names.

applyMapping_mapping :: Lens' ApplyMapping [Mapping] Source #

Specifies the mapping of data property keys in the data source to data property keys in the data target.

AthenaConnectorSource

athenaConnectorSource_outputSchemas :: Lens' AthenaConnectorSource (Maybe [GlueSchema]) Source #

Specifies the data schema for the custom Athena source.

athenaConnectorSource_connectionName :: Lens' AthenaConnectorSource Text Source #

The name of the connection that is associated with the connector.

athenaConnectorSource_connectorName :: Lens' AthenaConnectorSource Text Source #

The name of a connector that assists with accessing the data store in Glue Studio.

athenaConnectorSource_connectionType :: Lens' AthenaConnectorSource Text Source #

The type of connection, such as marketplace.athena or custom.athena, designating a connection to an Amazon Athena data store.

athenaConnectorSource_schemaName :: Lens' AthenaConnectorSource Text Source #

The name of the CloudWatch log group to read from. For example, /aws-glue/jobs/output.

AuditContext

auditContext_requestedColumns :: Lens' AuditContext (Maybe [Text]) Source #

The requested columns for audit.

BackfillError

backfillError_code :: Lens' BackfillError (Maybe BackfillErrorCode) Source #

The error code for an error that occurred when registering partition indexes for an existing table.

backfillError_partitions :: Lens' BackfillError (Maybe [PartitionValueList]) Source #

A list of a limited number of partitions in the response.

BasicCatalogTarget

basicCatalogTarget_inputs :: Lens' BasicCatalogTarget (NonEmpty Text) Source #

The nodes that are inputs to the data target.

basicCatalogTarget_database :: Lens' BasicCatalogTarget Text Source #

The database that contains the table you want to use as the target. This database must already exist in the Data Catalog.

basicCatalogTarget_table :: Lens' BasicCatalogTarget Text Source #

The table that defines the schema of your output data. This table must already exist in the Data Catalog.

BatchStopJobRunError

batchStopJobRunError_errorDetail :: Lens' BatchStopJobRunError (Maybe ErrorDetail) Source #

Specifies details about the error that was encountered.

batchStopJobRunError_jobName :: Lens' BatchStopJobRunError (Maybe Text) Source #

The name of the job definition that is used in the job run in question.

batchStopJobRunError_jobRunId :: Lens' BatchStopJobRunError (Maybe Text) Source #

The JobRunId of the job run in question.

BatchStopJobRunSuccessfulSubmission

batchStopJobRunSuccessfulSubmission_jobName :: Lens' BatchStopJobRunSuccessfulSubmission (Maybe Text) Source #

The name of the job definition used in the job run that was stopped.

BatchUpdatePartitionFailureEntry

BatchUpdatePartitionRequestEntry

BinaryColumnStatisticsData

binaryColumnStatisticsData_maximumLength :: Lens' BinaryColumnStatisticsData Natural Source #

The size of the longest bit sequence in the column.

Blueprint

blueprint_blueprintLocation :: Lens' Blueprint (Maybe Text) Source #

Specifies the path in Amazon S3 where the blueprint is published.

blueprint_blueprintServiceLocation :: Lens' Blueprint (Maybe Text) Source #

Specifies a path in Amazon S3 where the blueprint is copied when you call CreateBlueprint/UpdateBlueprint to register the blueprint in Glue.

blueprint_createdOn :: Lens' Blueprint (Maybe UTCTime) Source #

The date and time the blueprint was registered.

blueprint_description :: Lens' Blueprint (Maybe Text) Source #

The description of the blueprint.

blueprint_lastActiveDefinition :: Lens' Blueprint (Maybe LastActiveDefinition) Source #

When there are multiple versions of a blueprint and the latest version has some errors, this attribute indicates the last successful blueprint definition that is available with the service.

blueprint_lastModifiedOn :: Lens' Blueprint (Maybe UTCTime) Source #

The date and time the blueprint was last modified.

blueprint_name :: Lens' Blueprint (Maybe Text) Source #

The name of the blueprint.

blueprint_parameterSpec :: Lens' Blueprint (Maybe Text) Source #

A JSON string that indicates the list of parameter specifications for the blueprint.

blueprint_status :: Lens' Blueprint (Maybe BlueprintStatus) Source #

The status of the blueprint registration.

  • Creating — The blueprint registration is in progress.
  • Active — The blueprint has been successfully registered.
  • Updating — An update to the blueprint registration is in progress.
  • Failed — The blueprint registration failed.

BlueprintDetails

blueprintDetails_runId :: Lens' BlueprintDetails (Maybe Text) Source #

The run ID for this blueprint.

BlueprintRun

blueprintRun_completedOn :: Lens' BlueprintRun (Maybe UTCTime) Source #

The date and time that the blueprint run completed.

blueprintRun_errorMessage :: Lens' BlueprintRun (Maybe Text) Source #

Indicates any errors that are seen while running the blueprint.

blueprintRun_parameters :: Lens' BlueprintRun (Maybe Text) Source #

The blueprint parameters as a string. You must provide a value for each required key in the parameter specification defined in Blueprint$ParameterSpec.

blueprintRun_roleArn :: Lens' BlueprintRun (Maybe Text) Source #

The role ARN. This role will be assumed by the Glue service and will be used to create the workflow and other entities of a workflow.

blueprintRun_rollbackErrorMessage :: Lens' BlueprintRun (Maybe Text) Source #

If there are any errors while creating the entities of a workflow, we try to roll back the created entities until that point and delete them. This attribute indicates the errors seen while trying to delete the entities that are created.

blueprintRun_runId :: Lens' BlueprintRun (Maybe Text) Source #

The run ID for this blueprint run.

blueprintRun_startedOn :: Lens' BlueprintRun (Maybe UTCTime) Source #

The date and time that the blueprint run started.

blueprintRun_state :: Lens' BlueprintRun (Maybe BlueprintRunState) Source #

The state of the blueprint run. Possible values are:

  • Running — The blueprint run is in progress.
  • Succeeded — The blueprint run completed successfully.
  • Failed — The blueprint run failed and rollback is complete.
  • Rolling Back — The blueprint run failed and rollback is in progress.

blueprintRun_workflowName :: Lens' BlueprintRun (Maybe Text) Source #

The name of a workflow that is created as a result of a successful blueprint run. If a blueprint run has an error, there will not be a workflow created.
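The four run states and the workflow rule above can be sketched with a small sum type. This is an illustration using only base: BlueprintRunState here is a local stand-in, and the helper names isFinished and mayHaveWorkflow are hypothetical.

```haskell
-- Local stand-in for the four documented blueprint run states.
data BlueprintRunState = Running | Succeeded | Failed | RollingBack
  deriving (Show, Eq, Enum, Bounded)

-- A run is finished once it is no longer running or rolling back.
isFinished :: BlueprintRunState -> Bool
isFinished Running = False
isFinished RollingBack = False
isFinished Succeeded = True
isFinished Failed = True

-- A workflow is only created by a successful run.
mayHaveWorkflow :: BlueprintRunState -> Bool
mayHaveWorkflow = (== Succeeded)

main :: IO ()
main = mapM_ report [minBound .. maxBound]
  where
    report s = putStrLn (show s ++ ": finished=" ++ show (isFinished s))
```

A caller polling a run would typically wait until isFinished holds and read blueprintRun_workflowName only when mayHaveWorkflow is true.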

BooleanColumnStatisticsData

CatalogEntry

catalogEntry_databaseName :: Lens' CatalogEntry Text Source #

The database in which the table metadata resides.

catalogEntry_tableName :: Lens' CatalogEntry Text Source #

The name of the table in question.

CatalogImportStatus

catalogImportStatus_importCompleted :: Lens' CatalogImportStatus (Maybe Bool) Source #

True if the migration has completed, or False otherwise.

catalogImportStatus_importTime :: Lens' CatalogImportStatus (Maybe UTCTime) Source #

The time that the migration was started.

catalogImportStatus_importedBy :: Lens' CatalogImportStatus (Maybe Text) Source #

The name of the person who initiated the migration.

CatalogKafkaSource

catalogKafkaSource_dataPreviewOptions :: Lens' CatalogKafkaSource (Maybe StreamingDataPreviewOptions) Source #

Specifies options related to data preview for viewing a sample of your data.

catalogKafkaSource_detectSchema :: Lens' CatalogKafkaSource (Maybe Bool) Source #

Whether to automatically determine the schema from the incoming data.

catalogKafkaSource_windowSize :: Lens' CatalogKafkaSource (Maybe Natural) Source #

The amount of time to spend processing each micro batch.

catalogKafkaSource_table :: Lens' CatalogKafkaSource Text Source #

The name of the table in the database to read from.

catalogKafkaSource_database :: Lens' CatalogKafkaSource Text Source #

The name of the database to read from.

CatalogKinesisSource

catalogKinesisSource_detectSchema :: Lens' CatalogKinesisSource (Maybe Bool) Source #

Whether to automatically determine the schema from the incoming data.

catalogKinesisSource_windowSize :: Lens' CatalogKinesisSource (Maybe Natural) Source #

The amount of time to spend processing each micro batch.

catalogKinesisSource_table :: Lens' CatalogKinesisSource Text Source #

The name of the table in the database to read from.

catalogKinesisSource_database :: Lens' CatalogKinesisSource Text Source #

The name of the database to read from.

CatalogSchemaChangePolicy

catalogSchemaChangePolicy_enableUpdateCatalog :: Lens' CatalogSchemaChangePolicy (Maybe Bool) Source #

Whether to use the specified update behavior when the crawler finds a changed schema.

catalogSchemaChangePolicy_updateBehavior :: Lens' CatalogSchemaChangePolicy (Maybe UpdateCatalogBehavior) Source #

The update behavior when the crawler finds a changed schema.

CatalogSource

catalogSource_name :: Lens' CatalogSource Text Source #

The name of the data store.

catalogSource_database :: Lens' CatalogSource Text Source #

The name of the database to read from.

catalogSource_table :: Lens' CatalogSource Text Source #

The name of the table in the database to read from.

CatalogTarget

catalogTarget_connectionName :: Lens' CatalogTarget (Maybe Text) Source #

The name of the connection for an Amazon S3-backed Data Catalog table to be a target of the crawl when using a Catalog connection type paired with a NETWORK Connection type.

catalogTarget_dlqEventQueueArn :: Lens' CatalogTarget (Maybe Text) Source #

A valid Amazon dead-letter SQS ARN. For example, arn:aws:sqs:region:account:deadLetterQueue.

catalogTarget_eventQueueArn :: Lens' CatalogTarget (Maybe Text) Source #

A valid Amazon SQS ARN. For example, arn:aws:sqs:region:account:sqs.

catalogTarget_databaseName :: Lens' CatalogTarget Text Source #

The name of the database to be synchronized.

catalogTarget_tables :: Lens' CatalogTarget (NonEmpty Text) Source #

A list of the tables to be synchronized.

Classifier

classifier_csvClassifier :: Lens' Classifier (Maybe CsvClassifier) Source #

A classifier for comma-separated values (CSV).

CloudWatchEncryption

cloudWatchEncryption_kmsKeyArn :: Lens' CloudWatchEncryption (Maybe Text) Source #

The Amazon Resource Name (ARN) of the KMS key to be used to encrypt the data.

CodeGenConfigurationNode

codeGenConfigurationNode_aggregate :: Lens' CodeGenConfigurationNode (Maybe Aggregate) Source #

Specifies a transform that groups rows by chosen fields and computes the aggregated value using the specified function.

codeGenConfigurationNode_applyMapping :: Lens' CodeGenConfigurationNode (Maybe ApplyMapping) Source #

Specifies a transform that maps data property keys in the data source to data property keys in the data target. You can rename keys, modify the data types for keys, and choose which keys to drop from the dataset.

codeGenConfigurationNode_catalogKafkaSource :: Lens' CodeGenConfigurationNode (Maybe CatalogKafkaSource) Source #

Specifies an Apache Kafka data store in the Data Catalog.

codeGenConfigurationNode_catalogTarget :: Lens' CodeGenConfigurationNode (Maybe BasicCatalogTarget) Source #

Specifies a target that uses a Glue Data Catalog table.

codeGenConfigurationNode_customCode :: Lens' CodeGenConfigurationNode (Maybe CustomCode) Source #

Specifies a transform that uses custom code you provide to perform the data transformation. The output is a collection of DynamicFrames.

codeGenConfigurationNode_dropDuplicates :: Lens' CodeGenConfigurationNode (Maybe DropDuplicates) Source #

Specifies a transform that removes rows of repeating data from a data set.

codeGenConfigurationNode_dropFields :: Lens' CodeGenConfigurationNode (Maybe DropFields) Source #

Specifies a transform that chooses the data property keys that you want to drop.

codeGenConfigurationNode_dropNullFields :: Lens' CodeGenConfigurationNode (Maybe DropNullFields) Source #

Specifies a transform that removes columns from the dataset if all values in the column are 'null'. By default, Glue Studio will recognize null objects, but some values such as empty strings, strings that are "null", -1 integers or other placeholders such as zeros, are not automatically recognized as nulls.

codeGenConfigurationNode_fillMissingValues :: Lens' CodeGenConfigurationNode (Maybe FillMissingValues) Source #

Specifies a transform that locates records in the dataset that have missing values and adds a new field with a value determined by imputation. The input data set is used to train the machine learning model that determines what the missing value should be.

codeGenConfigurationNode_filter :: Lens' CodeGenConfigurationNode (Maybe Filter) Source #

Specifies a transform that splits a dataset into two, based on a filter condition.

codeGenConfigurationNode_jDBCConnectorTarget :: Lens' CodeGenConfigurationNode (Maybe JDBCConnectorTarget) Source #

Specifies a data target that writes to Amazon S3 in Apache Parquet columnar storage.

codeGenConfigurationNode_join :: Lens' CodeGenConfigurationNode (Maybe Join) Source #

Specifies a transform that joins two datasets into one dataset using a comparison phrase on the specified data property keys. You can use inner, outer, left, right, left semi, and left anti joins.

codeGenConfigurationNode_merge :: Lens' CodeGenConfigurationNode (Maybe Merge) Source #

Specifies a transform that merges a DynamicFrame with a staging DynamicFrame based on the specified primary keys to identify records. Duplicate records (records with the same primary keys) are not de-duplicated.

codeGenConfigurationNode_pIIDetection :: Lens' CodeGenConfigurationNode (Maybe PIIDetection) Source #

Specifies a transform that identifies, removes or masks PII data.

codeGenConfigurationNode_renameField :: Lens' CodeGenConfigurationNode (Maybe RenameField) Source #

Specifies a transform that renames a single data property key.

codeGenConfigurationNode_s3CatalogSource :: Lens' CodeGenConfigurationNode (Maybe S3CatalogSource) Source #

Specifies an Amazon S3 data store in the Glue Data Catalog.

codeGenConfigurationNode_s3CatalogTarget :: Lens' CodeGenConfigurationNode (Maybe S3CatalogTarget) Source #

Specifies a data target that writes to Amazon S3 using the Glue Data Catalog.

codeGenConfigurationNode_s3CsvSource :: Lens' CodeGenConfigurationNode (Maybe S3CsvSource) Source #

Specifies a comma-separated value (CSV) data store stored in Amazon S3.

codeGenConfigurationNode_s3GlueParquetTarget :: Lens' CodeGenConfigurationNode (Maybe S3GlueParquetTarget) Source #

Specifies a data target that writes to Amazon S3 in Apache Parquet columnar storage.

codeGenConfigurationNode_s3ParquetSource :: Lens' CodeGenConfigurationNode (Maybe S3ParquetSource) Source #

Specifies an Apache Parquet data store stored in Amazon S3.

codeGenConfigurationNode_selectFields :: Lens' CodeGenConfigurationNode (Maybe SelectFields) Source #

Specifies a transform that chooses the data property keys that you want to keep.

codeGenConfigurationNode_selectFromCollection :: Lens' CodeGenConfigurationNode (Maybe SelectFromCollection) Source #

Specifies a transform that chooses one DynamicFrame from a collection of DynamicFrames. The output is the selected DynamicFrame.

codeGenConfigurationNode_sparkSQL :: Lens' CodeGenConfigurationNode (Maybe SparkSQL) Source #

Specifies a transform where you enter a SQL query using Spark SQL syntax to transform the data. The output is a single DynamicFrame.

codeGenConfigurationNode_spigot :: Lens' CodeGenConfigurationNode (Maybe Spigot) Source #

Specifies a transform that writes samples of the data to an Amazon S3 bucket.

codeGenConfigurationNode_splitFields :: Lens' CodeGenConfigurationNode (Maybe SplitFields) Source #

Specifies a transform that splits data property keys into two DynamicFrames. The output is a collection of DynamicFrames: one with selected data property keys, and one with the remaining data property keys.

codeGenConfigurationNode_union :: Lens' CodeGenConfigurationNode (Maybe Union) Source #

Specifies a transform that combines the rows from two or more datasets into a single result.

CodeGenEdge

codeGenEdge_source :: Lens' CodeGenEdge Text Source #

The ID of the node at which the edge starts.

codeGenEdge_target :: Lens' CodeGenEdge Text Source #

The ID of the node at which the edge ends.
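Because each edge names the node it starts from and the node it ends at, a list of edges describes a directed graph. A minimal sketch of walking that graph, using only base; the CodeGenEdge record is a local stand-in with String fields, and the node IDs are made up for the example.

```haskell
-- Local stand-in for CodeGenEdge with its two documented fields.
data CodeGenEdge = CodeGenEdge
  { source :: String -- ID of the node at which the edge starts
  , target :: String -- ID of the node at which the edge ends
  }
  deriving (Show, Eq)

-- All nodes reachable in one step from the given node.
successors :: String -> [CodeGenEdge] -> [String]
successors nodeId edges = [target e | e <- edges, source e == nodeId]

-- Hypothetical three-edge graph: read -> map -> write, read -> audit.
exampleEdges :: [CodeGenEdge]
exampleEdges =
  [ CodeGenEdge "read" "map"
  , CodeGenEdge "map" "write"
  , CodeGenEdge "read" "audit"
  ]

main :: IO ()
main = print (successors "read" exampleEdges)
```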

CodeGenNode

codeGenNode_lineNumber :: Lens' CodeGenNode (Maybe Int) Source #

The line number of the node.

codeGenNode_id :: Lens' CodeGenNode Text Source #

A node identifier that is unique within the node's graph.

codeGenNode_nodeType :: Lens' CodeGenNode Text Source #

The type of node that this is.

codeGenNode_args :: Lens' CodeGenNode [CodeGenNodeArg] Source #

Properties of the node, in the form of name-value pairs.

CodeGenNodeArg

codeGenNodeArg_param :: Lens' CodeGenNodeArg (Maybe Bool) Source #

True if the value is used as a parameter.

codeGenNodeArg_name :: Lens' CodeGenNodeArg Text Source #

The name of the argument or property.

codeGenNodeArg_value :: Lens' CodeGenNodeArg Text Source #

The value of the argument or property.

Column

column_comment :: Lens' Column (Maybe Text) Source #

A free-form text comment.

column_parameters :: Lens' Column (Maybe (HashMap Text Text)) Source #

These key-value pairs define properties associated with the column.

column_type :: Lens' Column (Maybe Text) Source #

The data type of the Column.

column_name :: Lens' Column Text Source #

The name of the Column.

ColumnError

columnError_columnName :: Lens' ColumnError (Maybe Text) Source #

The name of the column that failed.

columnError_error :: Lens' ColumnError (Maybe ErrorDetail) Source #

An error message with the reason for the failure of an operation.

ColumnImportance

columnImportance_importance :: Lens' ColumnImportance (Maybe Double) Source #

The column importance score for the column, as a decimal.

ColumnRowFilter

ColumnStatistics

columnStatistics_columnName :: Lens' ColumnStatistics Text Source #

The name of the column to which the statistics belong.

columnStatistics_analyzedTime :: Lens' ColumnStatistics UTCTime Source #

The timestamp of when column statistics were generated.

columnStatistics_statisticsData :: Lens' ColumnStatistics ColumnStatisticsData Source #

A ColumnStatisticsData object that contains the statistics data values.

ColumnStatisticsData

ColumnStatisticsError

columnStatisticsError_error :: Lens' ColumnStatisticsError (Maybe ErrorDetail) Source #

An error message with the reason for the failure of an operation.

Condition

condition_crawlState :: Lens' Condition (Maybe CrawlState) Source #

The state of the crawler to which this condition applies.

condition_crawlerName :: Lens' Condition (Maybe Text) Source #

The name of the crawler to which this condition applies.

condition_jobName :: Lens' Condition (Maybe Text) Source #

The name of the job whose JobRuns this condition applies to, and on which this trigger waits.

condition_state :: Lens' Condition (Maybe JobRunState) Source #

The condition state. Currently, the only job states that a trigger can listen for are SUCCEEDED, STOPPED, FAILED, and TIMEOUT. The only crawler states that a trigger can listen for are SUCCEEDED, FAILED, and CANCELLED.
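The restriction on which states a trigger can listen for can be checked before a condition is submitted. The sketch below uses only base; the Condition record is a local stand-in that carries states as plain strings instead of the JobRunState and CrawlState types, and isListenable is a hypothetical helper.

```haskell
-- Local stand-in for Condition; states are plain strings here.
data Condition = Condition
  { jobName :: Maybe String
  , state :: Maybe String
  , crawlerName :: Maybe String
  , crawlState :: Maybe String
  }
  deriving (Show, Eq)

-- Job states a trigger can listen for, as documented above.
listenableJobStates :: [String]
listenableJobStates = ["SUCCEEDED", "STOPPED", "FAILED", "TIMEOUT"]

-- Crawler states a trigger can listen for, as documented above.
listenableCrawlStates :: [String]
listenableCrawlStates = ["SUCCEEDED", "FAILED", "CANCELLED"]

-- A condition is acceptable when every state it sets is listenable.
isListenable :: Condition -> Bool
isListenable c =
  ok listenableJobStates (state c) && ok listenableCrawlStates (crawlState c)
  where
    ok allowed = maybe True (`elem` allowed)

main :: IO ()
main = do
  print (isListenable (Condition (Just "etl") (Just "SUCCEEDED") Nothing Nothing))
  print (isListenable (Condition (Just "etl") (Just "RUNNING") Nothing Nothing))
```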

ConfusionMatrix

confusionMatrix_numFalseNegatives :: Lens' ConfusionMatrix (Maybe Integer) Source #

The number of matches in the data that the transform didn't find, in the confusion matrix for your transform.

confusionMatrix_numFalsePositives :: Lens' ConfusionMatrix (Maybe Integer) Source #

The number of nonmatches in the data that the transform incorrectly classified as a match, in the confusion matrix for your transform.

confusionMatrix_numTrueNegatives :: Lens' ConfusionMatrix (Maybe Integer) Source #

The number of nonmatches in the data that the transform correctly rejected, in the confusion matrix for your transform.

confusionMatrix_numTruePositives :: Lens' ConfusionMatrix (Maybe Integer) Source #

The number of matches in the data that the transform correctly found, in the confusion matrix for your transform.
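The four counts are enough to derive the usual quality metrics: precision is TP / (TP + FP) and recall is TP / (TP + FN). A minimal sketch using only base; the ConfusionMatrix record is a local stand-in for the type above, and the helper names and example counts are made up.

```haskell
-- Local stand-in for ConfusionMatrix with its four optional counts.
data ConfusionMatrix = ConfusionMatrix
  { numTruePositives :: Maybe Integer
  , numFalsePositives :: Maybe Integer
  , numTrueNegatives :: Maybe Integer
  , numFalseNegatives :: Maybe Integer
  }
  deriving (Show, Eq)

-- Safe division: Nothing when the denominator is zero.
ratio :: Integer -> Integer -> Maybe Double
ratio _ 0 = Nothing
ratio n d = Just (fromIntegral n / fromIntegral d)

-- Precision = TP / (TP + FP); Nothing if a count is missing.
precision :: ConfusionMatrix -> Maybe Double
precision m = do
  tp <- numTruePositives m
  fp <- numFalsePositives m
  ratio tp (tp + fp)

-- Recall = TP / (TP + FN); Nothing if a count is missing.
recall :: ConfusionMatrix -> Maybe Double
recall m = do
  tp <- numTruePositives m
  fn <- numFalseNegatives m
  ratio tp (tp + fn)

-- Hypothetical counts for illustration.
example :: ConfusionMatrix
example = ConfusionMatrix (Just 80) (Just 20) (Just 890) (Just 10)

main :: IO ()
main = do
  print (precision example)
  print (recall example)
```

Returning Maybe Double keeps the two failure cases (a missing count, or a zero denominator) out of the numeric result.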

Connection

connection_connectionProperties :: Lens' Connection (Maybe (HashMap ConnectionPropertyKey Text)) Source #

These key-value pairs define parameters for the connection:

  • HOST - The host URI: either the fully qualified domain name (FQDN) or the IPv4 address of the database host.
  • PORT - The port number, between 1024 and 65535, of the port on which the database host is listening for database connections.
  • USER_NAME - The name under which to log in to the database. The value string for USER_NAME is "USERNAME".
  • PASSWORD - A password, if one is used, for the user name.
  • ENCRYPTED_PASSWORD - When you enable connection password protection by setting ConnectionPasswordEncryption in the Data Catalog encryption settings, this field stores the encrypted password.
  • JDBC_DRIVER_JAR_URI - The Amazon Simple Storage Service (Amazon S3) path of the JAR file that contains the JDBC driver to use.
  • JDBC_DRIVER_CLASS_NAME - The class name of the JDBC driver to use.
  • JDBC_ENGINE - The name of the JDBC engine to use.
  • JDBC_ENGINE_VERSION - The version of the JDBC engine to use.
  • CONFIG_FILES - (Reserved for future use.)
  • INSTANCE_ID - The instance ID to use.
  • JDBC_CONNECTION_URL - The URL for connecting to a JDBC data source.
  • JDBC_ENFORCE_SSL - A Boolean string (true, false) specifying whether Secure Sockets Layer (SSL) with hostname matching is enforced for the JDBC connection on the client. The default is false.
  • CUSTOM_JDBC_CERT - An Amazon S3 location specifying the customer's root certificate. Glue uses this root certificate to validate the customer’s certificate when connecting to the customer database. Glue only handles X.509 certificates. The certificate provided must be DER-encoded and supplied in Base64-encoded PEM format.
  • SKIP_CUSTOM_JDBC_CERT_VALIDATION - By default, this is false. Glue validates the Signature algorithm and Subject Public Key Algorithm for the customer certificate. The only permitted algorithms for the Signature algorithm are SHA256withRSA, SHA384withRSA or SHA512withRSA. For the Subject Public Key Algorithm, the key length must be at least 2048. You can set the value of this property to true to skip Glue’s validation of the customer certificate.
  • CUSTOM_JDBC_CERT_STRING - A custom JDBC certificate string which is used for domain match or distinguished name match to prevent a man-in-the-middle attack. In Oracle database, this is used as the SSL_SERVER_CERT_DN; in Microsoft SQL Server, this is used as the hostNameInCertificate.
  • CONNECTION_URL - The URL for connecting to a general (non-JDBC) data source.
  • SECRET_ID - The secret ID used for the secret manager of credentials.
  • CONNECTOR_URL - The connector URL for a MARKETPLACE or CUSTOM connection.
  • CONNECTOR_TYPE - The connector type for a MARKETPLACE or CUSTOM connection.
  • CONNECTOR_CLASS_NAME - The connector class name for a MARKETPLACE or CUSTOM connection.
  • KAFKA_BOOTSTRAP_SERVERS - A comma-separated list of host and port pairs that are the addresses of the Apache Kafka brokers in a Kafka cluster to which a Kafka client will connect and bootstrap itself.
  • KAFKA_SSL_ENABLED - Whether to enable or disable SSL on an Apache Kafka connection. Default value is "true".
  • KAFKA_CUSTOM_CERT - The Amazon S3 URL for the private CA cert file (.pem format). The default is an empty string.
  • KAFKA_SKIP_CUSTOM_CERT_VALIDATION - Whether to skip the validation of the CA cert file or not. Glue validates for three algorithms: SHA256withRSA, SHA384withRSA and SHA512withRSA. Default value is "false".
  • KAFKA_CLIENT_KEYSTORE - The Amazon S3 location of the client keystore file for Kafka client side authentication (Optional).
  • KAFKA_CLIENT_KEYSTORE_PASSWORD - The password to access the provided keystore (Optional).
  • KAFKA_CLIENT_KEY_PASSWORD - A keystore can consist of multiple keys, so this is the password to access the client key to be used with the Kafka server side key (Optional).
  • ENCRYPTED_KAFKA_CLIENT_KEYSTORE_PASSWORD - The encrypted version of the Kafka client keystore password (if the user has the Glue encrypt passwords setting selected).
  • ENCRYPTED_KAFKA_CLIENT_KEY_PASSWORD - The encrypted version of the Kafka client key password (if the user has the Glue encrypt passwords setting selected).
  • KAFKA_SASL_MECHANISM - "SCRAM-SHA-512" or "GSSAPI". These are the two supported SASL Mechanisms.
  • KAFKA_SASL_SCRAM_USERNAME - A plaintext username used to authenticate with the "SCRAM-SHA-512" mechanism.
  • KAFKA_SASL_SCRAM_PASSWORD - A plaintext password used to authenticate with the "SCRAM-SHA-512" mechanism.
  • ENCRYPTED_KAFKA_SASL_SCRAM_PASSWORD - The encrypted version of the Kafka SASL SCRAM password (if the user has the Glue encrypt passwords setting selected).
  • KAFKA_SASL_GSSAPI_KEYTAB - The S3 location of a Kerberos keytab file. A keytab stores long-term keys for one or more principals. For more information, see MIT Kerberos Documentation: Keytab.
  • KAFKA_SASL_GSSAPI_KRB5_CONF - The S3 location of a Kerberos krb5.conf file. A krb5.conf stores Kerberos configuration information, such as the location of the KDC server. For more information, see MIT Kerberos Documentation: krb5.conf.
  • KAFKA_SASL_GSSAPI_SERVICE - The Kerberos service name, as set with sasl.kerberos.service.name in your Kafka Configuration.
  • KAFKA_SASL_GSSAPI_PRINCIPAL - The name of the Kerberos principal used by Glue. For more information, see Kafka Documentation: Configuring Kafka Brokers.
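Two of the constraints in the list above, the PORT range and the Boolean string for JDBC_ENFORCE_SSL, can be checked on the client before a connection is created. The sketch below uses only base and models the properties as a plain association list with String keys; the real field is a HashMap keyed by ConnectionPropertyKey, and the helper names are hypothetical.

```haskell
import Text.Read (readMaybe)

-- Stand-in for the property map: keys are the names listed above.
type ConnectionProperties = [(String, String)]

-- PORT must be a number between 1024 and 65535.
validPort :: ConnectionProperties -> Either String Int
validPort props = case lookup "PORT" props of
  Nothing -> Left "PORT is missing"
  Just raw -> case readMaybe raw of
    Nothing -> Left ("PORT is not a number: " ++ raw)
    Just p
      | p >= 1024 && p <= 65535 -> Right p
      | otherwise -> Left ("PORT out of range: " ++ show p)

-- JDBC_ENFORCE_SSL is a Boolean string; the documented default is false.
enforceSsl :: ConnectionProperties -> Either String Bool
enforceSsl props = case lookup "JDBC_ENFORCE_SSL" props of
  Nothing -> Right False
  Just "true" -> Right True
  Just "false" -> Right False
  Just other -> Left ("JDBC_ENFORCE_SSL must be true or false, got: " ++ other)

main :: IO ()
main = do
  let props = [("HOST", "db.example.com"), ("PORT", "5432"), ("JDBC_ENFORCE_SSL", "true")]
  print (validPort props)
  print (enforceSsl props)
  print (validPort [("PORT", "80")])
```

The host name in the example is a placeholder; only the key names and the stated ranges come from the list above.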

connection_connectionType :: Lens' Connection (Maybe ConnectionType) Source #

The type of the connection. Currently, SFTP is not supported.

connection_creationTime :: Lens' Connection (Maybe UTCTime) Source #

The time that this connection definition was created.

connection_description :: Lens' Connection (Maybe Text) Source #

The description of the connection.

connection_lastUpdatedBy :: Lens' Connection (Maybe Text) Source #

The user, group, or role that last updated this connection definition.

connection_lastUpdatedTime :: Lens' Connection (Maybe UTCTime) Source #

The last time that this connection definition was updated.

connection_matchCriteria :: Lens' Connection (Maybe [Text]) Source #

A list of criteria that can be used in selecting this connection.

connection_name :: Lens' Connection (Maybe Text) Source #

The name of the connection definition.

connection_physicalConnectionRequirements :: Lens' Connection (Maybe PhysicalConnectionRequirements) Source #

A map of physical connection requirements, such as virtual private cloud (VPC) and SecurityGroup, that are needed to make this connection successfully.

ConnectionInput

connectionInput_description :: Lens' ConnectionInput (Maybe Text) Source #

The description of the connection.

connectionInput_matchCriteria :: Lens' ConnectionInput (Maybe [Text]) Source #

A list of criteria that can be used in selecting this connection.

connectionInput_physicalConnectionRequirements :: Lens' ConnectionInput (Maybe PhysicalConnectionRequirements) Source #

A map of physical connection requirements, such as virtual private cloud (VPC) and SecurityGroup, that are needed to successfully make this connection.

connectionInput_name :: Lens' ConnectionInput Text Source #

The name of the connection.

connectionInput_connectionType :: Lens' ConnectionInput ConnectionType Source #

The type of the connection. Currently, these types are supported:

  • JDBC - Designates a connection to a database through Java Database Connectivity (JDBC).
  • KAFKA - Designates a connection to an Apache Kafka streaming platform.
  • MONGODB - Designates a connection to a MongoDB document database.
  • NETWORK - Designates a network connection to a data source within an Amazon Virtual Private Cloud environment (Amazon VPC).
  • MARKETPLACE - Uses configuration settings contained in a connector purchased from Amazon Web Services Marketplace to read from and write to data stores that are not natively supported by Glue.
  • CUSTOM - Uses configuration settings contained in a custom connector to read from and write to data stores that are not natively supported by Glue.

SFTP is not supported.

connectionInput_connectionProperties :: Lens' ConnectionInput (HashMap ConnectionPropertyKey Text) Source #

These key-value pairs define parameters for the connection.

ConnectionPasswordEncryption

connectionPasswordEncryption_awsKmsKeyId :: Lens' ConnectionPasswordEncryption (Maybe Text) Source #

A KMS key that is used to encrypt the connection password.

If connection password protection is enabled, the caller of CreateConnection and UpdateConnection needs at least kms:Encrypt permission on the specified KMS key, to encrypt passwords before storing them in the Data Catalog.

You can set the decrypt permission to enable or restrict access on the password key according to your security requirements.

connectionPasswordEncryption_returnConnectionPasswordEncrypted :: Lens' ConnectionPasswordEncryption Bool Source #

When the ReturnConnectionPasswordEncrypted flag is set to "true", passwords remain encrypted in the responses of GetConnection and GetConnections. This encryption takes effect independently from catalog encryption.

ConnectionsList

connectionsList_connections :: Lens' ConnectionsList (Maybe [Text]) Source #

A list of connections used by the job.

Crawl

crawl_completedOn :: Lens' Crawl (Maybe UTCTime) Source #

The date and time on which the crawl completed.

crawl_errorMessage :: Lens' Crawl (Maybe Text) Source #

The error message associated with the crawl.

crawl_logGroup :: Lens' Crawl (Maybe Text) Source #

The log group associated with the crawl.

crawl_logStream :: Lens' Crawl (Maybe Text) Source #

The log stream associated with the crawl.

crawl_startedOn :: Lens' Crawl (Maybe UTCTime) Source #

The date and time on which the crawl started.

crawl_state :: Lens' Crawl (Maybe CrawlState) Source #

The state of the crawler.

Crawler

crawler_classifiers :: Lens' Crawler (Maybe [Text]) Source #

A list of UTF-8 strings that specify the custom classifiers that are associated with the crawler.

crawler_configuration :: Lens' Crawler (Maybe Text) Source #

Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler's behavior. For more information, see Setting crawler configuration options.

crawler_crawlElapsedTime :: Lens' Crawler (Maybe Integer) Source #

If the crawler is running, contains the total time elapsed since the last crawl began.

crawler_crawlerSecurityConfiguration :: Lens' Crawler (Maybe Text) Source #

The name of the SecurityConfiguration structure to be used by this crawler.

crawler_creationTime :: Lens' Crawler (Maybe UTCTime) Source #

The time that the crawler was created.

crawler_databaseName :: Lens' Crawler (Maybe Text) Source #

The name of the database in which the crawler's output is stored.

crawler_description :: Lens' Crawler (Maybe Text) Source #

A description of the crawler.

crawler_lakeFormationConfiguration :: Lens' Crawler (Maybe LakeFormationConfiguration) Source #

Specifies whether the crawler should use Lake Formation credentials for the crawler instead of the IAM role credentials.

crawler_lastCrawl :: Lens' Crawler (Maybe LastCrawlInfo) Source #

The status of the last crawl, and potentially error information if an error occurred.

crawler_lastUpdated :: Lens' Crawler (Maybe UTCTime) Source #

The time that the crawler was last updated.

crawler_lineageConfiguration :: Lens' Crawler (Maybe LineageConfiguration) Source #

A configuration that specifies whether data lineage is enabled for the crawler.

crawler_name :: Lens' Crawler (Maybe Text) Source #

The name of the crawler.

crawler_recrawlPolicy :: Lens' Crawler (Maybe RecrawlPolicy) Source #

A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.

crawler_role :: Lens' Crawler (Maybe Text) Source #

The Amazon Resource Name (ARN) of an IAM role that's used to access customer resources, such as Amazon Simple Storage Service (Amazon S3) data.

crawler_schedule :: Lens' Crawler (Maybe Schedule) Source #

For scheduled crawlers, the schedule when the crawler runs.

crawler_schemaChangePolicy :: Lens' Crawler (Maybe SchemaChangePolicy) Source #

The policy that specifies update and delete behaviors for the crawler.

crawler_state :: Lens' Crawler (Maybe CrawlerState) Source #

Indicates whether the crawler is running, or whether a run is pending.

crawler_tablePrefix :: Lens' Crawler (Maybe Text) Source #

The prefix added to the names of tables that are created.

crawler_targets :: Lens' Crawler (Maybe CrawlerTargets) Source #

A collection of targets to crawl.

crawler_version :: Lens' Crawler (Maybe Integer) Source #

The version of the crawler.

CrawlerHistory

crawlerHistory_crawlId :: Lens' CrawlerHistory (Maybe Text) Source #

A UUID identifier for each crawl.

crawlerHistory_dPUHour :: Lens' CrawlerHistory (Maybe Double) Source #

The number of data processing units (DPU) used in hours for the crawl.

crawlerHistory_endTime :: Lens' CrawlerHistory (Maybe UTCTime) Source #

The date and time on which the crawl ended.

crawlerHistory_errorMessage :: Lens' CrawlerHistory (Maybe Text) Source #

If an error occurred, the error message associated with the crawl.

crawlerHistory_logGroup :: Lens' CrawlerHistory (Maybe Text) Source #

The log group associated with the crawl.

crawlerHistory_logStream :: Lens' CrawlerHistory (Maybe Text) Source #

The log stream associated with the crawl.

crawlerHistory_messagePrefix :: Lens' CrawlerHistory (Maybe Text) Source #

The prefix for a CloudWatch message about this crawl.

crawlerHistory_startTime :: Lens' CrawlerHistory (Maybe UTCTime) Source #

The date and time on which the crawl started.

crawlerHistory_summary :: Lens' CrawlerHistory (Maybe Text) Source #

A run summary for the specific crawl in JSON. Contains the catalog tables and partitions that were added, updated, or deleted.

CrawlerMetrics

crawlerMetrics_lastRuntimeSeconds :: Lens' CrawlerMetrics (Maybe Double) Source #

The duration of the crawler's most recent run, in seconds.

crawlerMetrics_medianRuntimeSeconds :: Lens' CrawlerMetrics (Maybe Double) Source #

The median duration of this crawler's runs, in seconds.

crawlerMetrics_stillEstimating :: Lens' CrawlerMetrics (Maybe Bool) Source #

True if the crawler is still estimating how long it will take to complete this run.

crawlerMetrics_tablesCreated :: Lens' CrawlerMetrics (Maybe Natural) Source #

The number of tables created by this crawler.

crawlerMetrics_tablesDeleted :: Lens' CrawlerMetrics (Maybe Natural) Source #

The number of tables deleted by this crawler.

crawlerMetrics_tablesUpdated :: Lens' CrawlerMetrics (Maybe Natural) Source #

The number of tables updated by this crawler.

crawlerMetrics_timeLeftSeconds :: Lens' CrawlerMetrics (Maybe Double) Source #

The estimated time left to complete a running crawl.

CrawlerNodeDetails

crawlerNodeDetails_crawls :: Lens' CrawlerNodeDetails (Maybe [Crawl]) Source #

A list of crawls represented by the crawl node.

CrawlerTargets

crawlerTargets_catalogTargets :: Lens' CrawlerTargets (Maybe [CatalogTarget]) Source #

Specifies Glue Data Catalog targets.

crawlerTargets_deltaTargets :: Lens' CrawlerTargets (Maybe [DeltaTarget]) Source #

Specifies Delta data store targets.

crawlerTargets_mongoDBTargets :: Lens' CrawlerTargets (Maybe [MongoDBTarget]) Source #

Specifies Amazon DocumentDB or MongoDB targets.

crawlerTargets_s3Targets :: Lens' CrawlerTargets (Maybe [S3Target]) Source #

Specifies Amazon Simple Storage Service (Amazon S3) targets.

CrawlsFilter

crawlsFilter_fieldName :: Lens' CrawlsFilter (Maybe FieldName) Source #

A key used to filter the crawler runs for a specified crawler. Valid values for each of the field names are:

  • CRAWL_ID: A string representing the UUID identifier for a crawl.
  • STATE: A string representing the state of the crawl.
  • START_TIME and END_TIME: The epoch timestamp in milliseconds.
  • DPU_HOUR: The number of data processing unit (DPU) hours used for the crawl.

crawlsFilter_fieldValue :: Lens' CrawlsFilter (Maybe Text) Source #

The value provided for comparison on the crawl field.

crawlsFilter_filterOperator :: Lens' CrawlsFilter (Maybe FilterOperator) Source #

A defined comparator that operates on the value. The available operators are:

  • GT: Greater than.
  • GE: Greater than or equal to.
  • LT: Less than.
  • LE: Less than or equal to.
  • EQ: Equal to.
  • NE: Not equal to.
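
The three CrawlsFilter lenses are typically set together. The following is a minimal sketch that depends only on base: the record, the enum constructors (`DpuHour`, `OpGE`, and so on), the `view`/`set` helpers, and `applyOperator` are hypothetical stand-ins written for illustration, and `String` stands in for `Text`. Only the lens names and the operator meanings come from the documentation above.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- Minimal van Laarhoven lens helpers, so the sketch depends only on base.
type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s

view :: Lens' s a -> s -> a
view l = getConst . l Const

set :: Lens' s a -> a -> s -> s
set l a = runIdentity . l (const (Identity a))

-- Hypothetical stand-ins for the generated enums and record.
data FieldName = CrawlId | State | StartTime | EndTime | DpuHour
  deriving (Eq, Show)

data FilterOperator = OpGT | OpGE | OpLT | OpLE | OpEQ | OpNE
  deriving (Eq, Show)

data CrawlsFilter = CrawlsFilter
  { fieldName :: Maybe FieldName
  , fieldValue :: Maybe String
  , filterOperator :: Maybe FilterOperator
  }
  deriving (Eq, Show)

crawlsFilter_fieldName :: Lens' CrawlsFilter (Maybe FieldName)
crawlsFilter_fieldName f s = (\a -> s {fieldName = a}) <$> f (fieldName s)

crawlsFilter_fieldValue :: Lens' CrawlsFilter (Maybe String)
crawlsFilter_fieldValue f s = (\a -> s {fieldValue = a}) <$> f (fieldValue s)

crawlsFilter_filterOperator :: Lens' CrawlsFilter (Maybe FilterOperator)
crawlsFilter_filterOperator f s =
  (\a -> s {filterOperator = a}) <$> f (filterOperator s)

-- A filter meaning "DPU_HOUR >= 0.5", built by setting each optional field.
dpuFilter :: CrawlsFilter
dpuFilter =
  set crawlsFilter_filterOperator (Just OpGE)
    . set crawlsFilter_fieldValue (Just "0.5")
    . set crawlsFilter_fieldName (Just DpuHour)
    $ CrawlsFilter Nothing Nothing Nothing

-- Local illustration of what each comparator means: actual `op` threshold.
applyOperator :: Ord a => FilterOperator -> a -> a -> Bool
applyOperator op actual threshold = case op of
  OpGT -> actual > threshold
  OpGE -> actual >= threshold
  OpLT -> actual < threshold
  OpLE -> actual <= threshold
  OpEQ -> actual == threshold
  OpNE -> actual /= threshold

main :: IO ()
main = do
  print dpuFilter
  print (view crawlsFilter_fieldValue dpuFilter)
  print (applyOperator OpGE (0.75 :: Double) 0.5)
```

With the real library the same shape would apply to the generated `CrawlsFilter` record and a lens package of your choice; the service, not the client, evaluates the comparison.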

CreateCsvClassifierRequest

createCsvClassifierRequest_allowSingleColumn :: Lens' CreateCsvClassifierRequest (Maybe Bool) Source #

Enables the processing of files that contain only one column.

createCsvClassifierRequest_delimiter :: Lens' CreateCsvClassifierRequest (Maybe Text) Source #

A custom symbol to denote what separates each column entry in the row.

createCsvClassifierRequest_disableValueTrimming :: Lens' CreateCsvClassifierRequest (Maybe Bool) Source #

Specifies not to trim values before identifying the type of column values. The default value is true.

createCsvClassifierRequest_header :: Lens' CreateCsvClassifierRequest (Maybe [Text]) Source #

A list of strings representing column names.

createCsvClassifierRequest_quoteSymbol :: Lens' CreateCsvClassifierRequest (Maybe Text) Source #

A custom symbol to denote what combines content into a single column value. Must be different from the column delimiter.

CreateGrokClassifierRequest

createGrokClassifierRequest_customPatterns :: Lens' CreateGrokClassifierRequest (Maybe Text) Source #

Optional custom grok patterns used by this classifier.

createGrokClassifierRequest_classification :: Lens' CreateGrokClassifierRequest Text Source #

An identifier of the data format that the classifier matches, such as Twitter, JSON, Omniture logs, Amazon CloudWatch Logs, and so on.

CreateJsonClassifierRequest

createJsonClassifierRequest_jsonPath :: Lens' CreateJsonClassifierRequest Text Source #

A JsonPath string defining the JSON data for the classifier to classify. Glue supports a subset of JsonPath, as described in Writing JsonPath Custom Classifiers.

CreateXMLClassifierRequest

createXMLClassifierRequest_rowTag :: Lens' CreateXMLClassifierRequest (Maybe Text) Source #

The XML tag designating the element that contains each record in an XML document being parsed. This can't identify a self-closing element (closed by />). An empty row element that contains only attributes can be parsed as long as it ends with a closing tag (for example, <row item_a="A" item_b="B"></row> is okay, but <row item_a="A" item_b="B" /> is not).

createXMLClassifierRequest_classification :: Lens' CreateXMLClassifierRequest Text Source #

An identifier of the data format that the classifier matches.

CsvClassifier

csvClassifier_allowSingleColumn :: Lens' CsvClassifier (Maybe Bool) Source #

Enables the processing of files that contain only one column.

csvClassifier_containsHeader :: Lens' CsvClassifier (Maybe CsvHeaderOption) Source #

Indicates whether the CSV file contains a header.

csvClassifier_creationTime :: Lens' CsvClassifier (Maybe UTCTime) Source #

The time that this classifier was registered.

csvClassifier_customDatatypeConfigured :: Lens' CsvClassifier (Maybe Bool) Source #

Enables the custom datatype to be configured.

csvClassifier_customDatatypes :: Lens' CsvClassifier (Maybe [Text]) Source #

A list of custom datatypes including "BINARY", "BOOLEAN", "DATE", "DECIMAL", "DOUBLE", "FLOAT", "INT", "LONG", "SHORT", "STRING", "TIMESTAMP".

csvClassifier_delimiter :: Lens' CsvClassifier (Maybe Text) Source #

A custom symbol to denote what separates each column entry in the row.

csvClassifier_disableValueTrimming :: Lens' CsvClassifier (Maybe Bool) Source #

Specifies not to trim values before identifying the type of column values. The default value is true.

csvClassifier_header :: Lens' CsvClassifier (Maybe [Text]) Source #

A list of strings representing column names.

csvClassifier_lastUpdated :: Lens' CsvClassifier (Maybe UTCTime) Source #

The time that this classifier was last updated.

csvClassifier_quoteSymbol :: Lens' CsvClassifier (Maybe Text) Source #

A custom symbol to denote what combines content into a single column value. It must be different from the column delimiter.

csvClassifier_version :: Lens' CsvClassifier (Maybe Integer) Source #

The version of this classifier.

csvClassifier_name :: Lens' CsvClassifier Text Source #

The name of the classifier.

CustomCode

customCode_outputSchemas :: Lens' CustomCode (Maybe [GlueSchema]) Source #

Specifies the data schema for the custom code transform.

customCode_name :: Lens' CustomCode Text Source #

The name of the transform node.

customCode_inputs :: Lens' CustomCode (NonEmpty Text) Source #

The data inputs identified by their node names.

customCode_code :: Lens' CustomCode Text Source #

The custom code that is used to perform the data transformation.

customCode_className :: Lens' CustomCode Text Source #

The name defined for the custom code node class.

CustomEntityType

customEntityType_contextWords :: Lens' CustomEntityType (Maybe (NonEmpty Text)) Source #

A list of context words. If none of these context words are found within the vicinity of the regular expression, the data will not be detected as sensitive data.

If no context words are passed, only a regular expression is checked.

customEntityType_name :: Lens' CustomEntityType Text Source #

A name for the custom pattern that allows it to be retrieved or deleted later. This name must be unique per Amazon Web Services account.

customEntityType_regexString :: Lens' CustomEntityType Text Source #

A regular expression string that is used for detecting sensitive data in a custom pattern.

DQResultsPublishingOptions

DQStopJobOnFailureOptions

dQStopJobOnFailureOptions_stopJobOnFailureTiming :: Lens' DQStopJobOnFailureOptions (Maybe DQStopJobOnFailureTiming) Source #

When to stop the job if your data quality evaluation fails. Options are Immediate or AfterDataLoad.

DataCatalogEncryptionSettings

dataCatalogEncryptionSettings_connectionPasswordEncryption :: Lens' DataCatalogEncryptionSettings (Maybe ConnectionPasswordEncryption) Source #

When connection password protection is enabled, the Data Catalog uses a customer-provided key to encrypt the password as part of CreateConnection or UpdateConnection and store it in the ENCRYPTED_PASSWORD field in the connection properties. You can enable catalog encryption or only password encryption.

dataCatalogEncryptionSettings_encryptionAtRest :: Lens' DataCatalogEncryptionSettings (Maybe EncryptionAtRest) Source #

Specifies the encryption-at-rest configuration for the Data Catalog.

DataLakePrincipal

dataLakePrincipal_dataLakePrincipalIdentifier :: Lens' DataLakePrincipal (Maybe Text) Source #

An identifier for the Lake Formation principal.

DataQualityEvaluationRunAdditionalRunOptions

DataQualityResult

dataQualityResult_completedOn :: Lens' DataQualityResult (Maybe UTCTime) Source #

The date and time when this data quality run completed.

dataQualityResult_dataSource :: Lens' DataQualityResult (Maybe DataSource) Source #

The table associated with the data quality result, if any.

dataQualityResult_evaluationContext :: Lens' DataQualityResult (Maybe Text) Source #

In the context of a job in Glue Studio, each node in the canvas is typically assigned some sort of name and data quality nodes will have names. In the case of multiple nodes, the evaluationContext can differentiate the nodes.

dataQualityResult_jobName :: Lens' DataQualityResult (Maybe Text) Source #

The job name associated with the data quality result, if any.

dataQualityResult_jobRunId :: Lens' DataQualityResult (Maybe Text) Source #

The job run ID associated with the data quality result, if any.

dataQualityResult_resultId :: Lens' DataQualityResult (Maybe Text) Source #

A unique result ID for the data quality result.

dataQualityResult_ruleResults :: Lens' DataQualityResult (Maybe (NonEmpty DataQualityRuleResult)) Source #

A list of DataQualityRuleResult objects representing the results for each rule.

dataQualityResult_rulesetEvaluationRunId :: Lens' DataQualityResult (Maybe Text) Source #

The unique run ID for the ruleset evaluation for this data quality result.

dataQualityResult_rulesetName :: Lens' DataQualityResult (Maybe Text) Source #

The name of the ruleset associated with the data quality result.

dataQualityResult_score :: Lens' DataQualityResult (Maybe Double) Source #

An aggregate data quality score. Represents the ratio of rules that passed to the total number of rules.
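
As a worked illustration of the ratio described above, the sketch below computes a score from a list of per-rule pass/fail flags. The function name `aggregateScore` and the use of `Bool` for a rule outcome are assumptions made for this example; the service computes the real score.

```haskell
-- | Ratio of passed rules to the total number of rules.
-- Returns Nothing for an empty rule list, where the ratio is undefined.
aggregateScore :: [Bool] -> Maybe Double
aggregateScore [] = Nothing
aggregateScore results =
  Just (fromIntegral passed / fromIntegral (length results))
  where
    passed = length (filter id results)

main :: IO ()
main = do
  -- Three of four rules passed.
  print (aggregateScore [True, True, False, True]) -- prints Just 0.75
  print (aggregateScore [])                        -- prints Nothing
```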

dataQualityResult_startedOn :: Lens' DataQualityResult (Maybe UTCTime) Source #

The date and time when this data quality run started.

DataQualityResultDescription

dataQualityResultDescription_dataSource :: Lens' DataQualityResultDescription (Maybe DataSource) Source #

The table name associated with the data quality result.

dataQualityResultDescription_jobName :: Lens' DataQualityResultDescription (Maybe Text) Source #

The job name associated with the data quality result.

dataQualityResultDescription_jobRunId :: Lens' DataQualityResultDescription (Maybe Text) Source #

The job run ID associated with the data quality result.

dataQualityResultDescription_resultId :: Lens' DataQualityResultDescription (Maybe Text) Source #

The unique result ID for this data quality result.

dataQualityResultDescription_startedOn :: Lens' DataQualityResultDescription (Maybe UTCTime) Source #

The time that the run started for this data quality result.

DataQualityResultFilterCriteria

dataQualityResultFilterCriteria_dataSource :: Lens' DataQualityResultFilterCriteria (Maybe DataSource) Source #

Filter results by the specified data source. For example, retrieving all results for a Glue table.

DataQualityRuleRecommendationRunDescription

DataQualityRuleRecommendationRunFilter

DataQualityRuleResult

dataQualityRuleResult_description :: Lens' DataQualityRuleResult (Maybe Text) Source #

A description of the data quality rule.

DataQualityRulesetEvaluationRunDescription

DataQualityRulesetEvaluationRunFilter

dataQualityRulesetEvaluationRunFilter_dataSource :: Lens' DataQualityRulesetEvaluationRunFilter DataSource Source #

Filter based on a data source (a Glue table) associated with the run.

DataQualityRulesetFilterCriteria

DataQualityRulesetListDetails

dataQualityRulesetListDetails_createdOn :: Lens' DataQualityRulesetListDetails (Maybe UTCTime) Source #

The date and time the data quality ruleset was created.

dataQualityRulesetListDetails_lastModifiedOn :: Lens' DataQualityRulesetListDetails (Maybe UTCTime) Source #

The date and time the data quality ruleset was last modified.

dataQualityRulesetListDetails_recommendationRunId :: Lens' DataQualityRulesetListDetails (Maybe Text) Source #

When a ruleset was created from a recommendation run, this run ID is generated to link the two together.

DataQualityTargetTable

dataQualityTargetTable_databaseName :: Lens' DataQualityTargetTable Text Source #

The name of the database where the Glue table exists.

DataSource

Database

database_catalogId :: Lens' Database (Maybe Text) Source #

The ID of the Data Catalog in which the database resides.

database_createTableDefaultPermissions :: Lens' Database (Maybe [PrincipalPermissions]) Source #

Creates a set of default permissions on the table for principals.

database_createTime :: Lens' Database (Maybe UTCTime) Source #

The time at which the metadata database was created in the catalog.

database_description :: Lens' Database (Maybe Text) Source #

A description of the database.

database_locationUri :: Lens' Database (Maybe Text) Source #

The location of the database (for example, an HDFS path).

database_parameters :: Lens' Database (Maybe (HashMap Text Text)) Source #

These key-value pairs define parameters and properties of the database.

database_targetDatabase :: Lens' Database (Maybe DatabaseIdentifier) Source #

A DatabaseIdentifier structure that describes a target database for resource linking.

database_name :: Lens' Database Text Source #

The name of the database. For Hive compatibility, this is folded to lowercase when it is stored.

DatabaseIdentifier

databaseIdentifier_catalogId :: Lens' DatabaseIdentifier (Maybe Text) Source #

The ID of the Data Catalog in which the database resides.

DatabaseInput

databaseInput_createTableDefaultPermissions :: Lens' DatabaseInput (Maybe [PrincipalPermissions]) Source #

Creates a set of default permissions on the table for principals.

databaseInput_description :: Lens' DatabaseInput (Maybe Text) Source #

A description of the database.

databaseInput_locationUri :: Lens' DatabaseInput (Maybe Text) Source #

The location of the database (for example, an HDFS path).

databaseInput_parameters :: Lens' DatabaseInput (Maybe (HashMap Text Text)) Source #

These key-value pairs define parameters and properties of the database.

databaseInput_targetDatabase :: Lens' DatabaseInput (Maybe DatabaseIdentifier) Source #

A DatabaseIdentifier structure that describes a target database for resource linking.

databaseInput_name :: Lens' DatabaseInput Text Source #

The name of the database. For Hive compatibility, this is folded to lowercase when it is stored.
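
Because the stored name is folded to lowercase, it can help to normalise the name on the client so that later lookups use the same spelling. The sketch below assumes a cut-down stand-in record with `String` in place of `Text`; `over` and `normalizeName` are hypothetical helpers, and only the lens names mirror the ones documented above.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Char (toLower)
import Data.Functor.Identity (Identity (..))

-- Minimal van Laarhoven lens helper, so the sketch depends only on base.
type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s

over :: Lens' s a -> (a -> a) -> s -> s
over l g = runIdentity . l (Identity . g)

-- Hypothetical stand-in with one required and one optional field.
data DatabaseInput = DatabaseInput
  { name :: String
  , description :: Maybe String
  }
  deriving (Eq, Show)

databaseInput_name :: Lens' DatabaseInput String
databaseInput_name f s = (\a -> s {name = a}) <$> f (name s)

databaseInput_description :: Lens' DatabaseInput (Maybe String)
databaseInput_description f s =
  (\a -> s {description = a}) <$> f (description s)

-- Lowercase the name locally, mirroring what happens when it is stored.
normalizeName :: DatabaseInput -> DatabaseInput
normalizeName = over databaseInput_name (map toLower)

main :: IO ()
main = do
  let input = DatabaseInput "Sales_DB" Nothing
      described = over databaseInput_description (const (Just "demo")) input
  print (normalizeName described)
```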

Datatype

datatype_id :: Lens' Datatype Text Source #

The datatype of the value.

datatype_label :: Lens' Datatype Text Source #

A label assigned to the datatype.

DateColumnStatisticsData

DecimalColumnStatisticsData

DecimalNumber

decimalNumber_unscaledValue :: Lens' DecimalNumber ByteString Source #

The unscaled numeric value.

Note: This Lens automatically encodes and decodes Base64 data. The underlying isomorphism will encode to Base64 representation during serialisation, and decode from Base64 representation during deserialisation. This Lens accepts and returns only raw unencoded data.

decimalNumber_scale :: Lens' DecimalNumber Int Source #

The scale that determines where the decimal point falls in the unscaled value.
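
To illustrate how the scale positions the decimal point, the sketch below treats the unscaled value as an already-decoded `Integer`. That is an assumption made for the example: the real field is a `ByteString`, and its byte encoding is not described here. The `Decimal` record and `toRationalValue` are hypothetical names.

```haskell
import Data.Ratio ((%))

-- Hypothetical stand-in: the unscaled value is assumed to be already
-- decoded to an Integer; the real field holds raw bytes.
data Decimal = Decimal
  { unscaled :: Integer
  , scale :: Int
  }
  deriving (Eq, Show)

-- | value = unscaled * 10^(-scale), kept exact as a Rational.
toRationalValue :: Decimal -> Rational
toRationalValue (Decimal u s)
  | s >= 0 = u % (10 ^ s)
  | otherwise = fromInteger (u * 10 ^ negate s)

main :: IO ()
main = do
  -- Unscaled 12345 with scale 2 represents 123.45.
  let d = Decimal 12345 2
  print (toRationalValue d)
  print (fromRational (toRationalValue d) :: Double)
```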

DeltaTarget

deltaTarget_connectionName :: Lens' DeltaTarget (Maybe Text) Source #

The name of the connection to use to connect to the Delta table target.

deltaTarget_createNativeDeltaTable :: Lens' DeltaTarget (Maybe Bool) Source #

Specifies whether the crawler will create native tables, to allow integration with query engines that support querying of the Delta transaction log directly.

deltaTarget_deltaTables :: Lens' DeltaTarget (Maybe [Text]) Source #

A list of the Amazon S3 paths to the Delta tables.

deltaTarget_writeManifest :: Lens' DeltaTarget (Maybe Bool) Source #

Specifies whether to write the manifest files to the Delta table path.

DevEndpoint

devEndpoint_arguments :: Lens' DevEndpoint (Maybe (HashMap Text Text)) Source #

A map of arguments used to configure the DevEndpoint.

Valid arguments are:

  • "--enable-glue-datacatalog": ""

You can specify a version of Python support for development endpoints by using the Arguments parameter in the CreateDevEndpoint or UpdateDevEndpoint APIs. If no arguments are provided, the version defaults to Python 2.

devEndpoint_availabilityZone :: Lens' DevEndpoint (Maybe Text) Source #

The Amazon Web Services Availability Zone where this DevEndpoint is located.

devEndpoint_createdTimestamp :: Lens' DevEndpoint (Maybe UTCTime) Source #

The point in time at which this DevEndpoint was created.

devEndpoint_endpointName :: Lens' DevEndpoint (Maybe Text) Source #

The name of the DevEndpoint.

devEndpoint_extraJarsS3Path :: Lens' DevEndpoint (Maybe Text) Source #

The path to one or more Java .jar files in an S3 bucket that should be loaded in your DevEndpoint.

You can only use pure Java/Scala libraries with a DevEndpoint.

devEndpoint_extraPythonLibsS3Path :: Lens' DevEndpoint (Maybe Text) Source #

The paths to one or more Python libraries in an Amazon S3 bucket that should be loaded in your DevEndpoint. Multiple values must be complete paths separated by a comma.

You can only use pure Python libraries with a DevEndpoint. Libraries that rely on C extensions, such as the pandas Python data analysis library, are not currently supported.

devEndpoint_failureReason :: Lens' DevEndpoint (Maybe Text) Source #

The reason for a current failure in this DevEndpoint.

devEndpoint_glueVersion :: Lens' DevEndpoint (Maybe Text) Source #

Glue version determines the versions of Apache Spark and Python that Glue supports. The Python version indicates the version supported for running your ETL scripts on development endpoints.

For more information about the available Glue versions and corresponding Spark and Python versions, see Glue version in the developer guide.

Development endpoints that are created without specifying a Glue version default to Glue 0.9.

You can specify a version of Python support for development endpoints by using the Arguments parameter in the CreateDevEndpoint or UpdateDevEndpoint APIs. If no arguments are provided, the version defaults to Python 2.

devEndpoint_lastModifiedTimestamp :: Lens' DevEndpoint (Maybe UTCTime) Source #

The point in time at which this DevEndpoint was last modified.

devEndpoint_lastUpdateStatus :: Lens' DevEndpoint (Maybe Text) Source #

The status of the last update.

devEndpoint_numberOfNodes :: Lens' DevEndpoint (Maybe Int) Source #

The number of Glue Data Processing Units (DPUs) allocated to this DevEndpoint.

devEndpoint_numberOfWorkers :: Lens' DevEndpoint (Maybe Int) Source #

The number of workers of a defined workerType that are allocated to the development endpoint.

The maximum number of workers you can define is 299 for G.1X and 149 for G.2X.

devEndpoint_privateAddress :: Lens' DevEndpoint (Maybe Text) Source #

A private IP address to access the DevEndpoint within a VPC if the DevEndpoint is created within one. The PrivateAddress field is present only when you create the DevEndpoint within your VPC.

devEndpoint_publicAddress :: Lens' DevEndpoint (Maybe Text) Source #

The public IP address used by this DevEndpoint. The PublicAddress field is present only when you create a non-virtual private cloud (VPC) DevEndpoint.

devEndpoint_publicKey :: Lens' DevEndpoint (Maybe Text) Source #

The public key to be used by this DevEndpoint for authentication. This attribute is provided for backward compatibility because the recommended attribute to use is public keys.

devEndpoint_publicKeys :: Lens' DevEndpoint (Maybe [Text]) Source #

A list of public keys to be used by the DevEndpoints for authentication. Using this attribute is preferred over a single public key because the public keys allow you to have a different private key per client.

If you previously created an endpoint with a public key, you must remove that key to be able to set a list of public keys. Call the UpdateDevEndpoint API operation with the public key content in the deletePublicKeys attribute, and the list of new keys in the addPublicKeys attribute.

devEndpoint_roleArn :: Lens' DevEndpoint (Maybe Text) Source #

The Amazon Resource Name (ARN) of the IAM role used in this DevEndpoint.

devEndpoint_securityConfiguration :: Lens' DevEndpoint (Maybe Text) Source #

The name of the SecurityConfiguration structure to be used with this DevEndpoint.

devEndpoint_securityGroupIds :: Lens' DevEndpoint (Maybe [Text]) Source #

A list of security group identifiers used in this DevEndpoint.

devEndpoint_status :: Lens' DevEndpoint (Maybe Text) Source #

The current status of this DevEndpoint.

devEndpoint_subnetId :: Lens' DevEndpoint (Maybe Text) Source #

The subnet ID for this DevEndpoint.

devEndpoint_vpcId :: Lens' DevEndpoint (Maybe Text) Source #

The ID of the virtual private cloud (VPC) used by this DevEndpoint.

devEndpoint_workerType :: Lens' DevEndpoint (Maybe WorkerType) Source #

The type of predefined worker that is allocated to the development endpoint. Accepts a value of Standard, G.1X, or G.2X.

  • For the Standard worker type, each worker provides 4 vCPU, 16 GB of memory and a 50 GB disk, and 2 executors per worker.
  • For the G.1X worker type, each worker maps to 1 DPU (4 vCPU, 16 GB of memory, 64 GB disk), and provides 1 executor per worker. We recommend this worker type for memory-intensive jobs.
  • For the G.2X worker type, each worker maps to 2 DPU (8 vCPU, 32 GB of memory, 128 GB disk), and provides 1 executor per worker. We recommend this worker type for memory-intensive jobs.

Known issue: when a development endpoint is created with the G.2X WorkerType configuration, the Spark drivers for the development endpoint will run on 4 vCPU, 16 GB of memory, and a 64 GB disk.
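
The worker-type figures above, together with the limits quoted for `devEndpoint_numberOfWorkers`, can be captured in a small client-side check. This is a sketch under stated assumptions: the `WorkerType` constructors here are local stand-ins rather than the generated ones, `validateWorkers` and `totalDpus` are hypothetical helpers, and no limit or DPU mapping is assumed for Standard because none is stated.

```haskell
-- Local stand-in for the generated WorkerType enum.
data WorkerType = Standard | G1X | G2X
  deriving (Eq, Show)

-- | Maximum worker counts quoted in the documentation (none for Standard).
maxWorkers :: WorkerType -> Maybe Int
maxWorkers G1X = Just 299
maxWorkers G2X = Just 149
maxWorkers Standard = Nothing

-- | DPUs per worker as documented (not stated for Standard).
dpusPerWorker :: WorkerType -> Maybe Int
dpusPerWorker G1X = Just 1
dpusPerWorker G2X = Just 2
dpusPerWorker Standard = Nothing

-- | Reject worker counts that are non-positive or above the quoted limit.
validateWorkers :: WorkerType -> Int -> Either String Int
validateWorkers wt n
  | n < 1 = Left "numberOfWorkers must be positive"
  | otherwise = case maxWorkers wt of
      Just limit
        | n > limit ->
            Left (show wt ++ " allows at most " ++ show limit ++ " workers")
      _ -> Right n

-- | Total DPUs for a worker count, where the per-worker mapping is known.
totalDpus :: WorkerType -> Int -> Maybe Int
totalDpus wt n = fmap (* n) (dpusPerWorker wt)

main :: IO ()
main = do
  print (validateWorkers G1X 299)
  print (validateWorkers G2X 150)
  print (totalDpus G2X 10)
```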

devEndpoint_yarnEndpointAddress :: Lens' DevEndpoint (Maybe Text) Source #

The YARN endpoint address used by this DevEndpoint.

devEndpoint_zeppelinRemoteSparkInterpreterPort :: Lens' DevEndpoint (Maybe Int) Source #

The Apache Zeppelin port for the remote Apache Spark interpreter.

DevEndpointCustomLibraries

devEndpointCustomLibraries_extraJarsS3Path :: Lens' DevEndpointCustomLibraries (Maybe Text) Source #

The path to one or more Java .jar files in an S3 bucket that should be loaded in your DevEndpoint.

You can only use pure Java/Scala libraries with a DevEndpoint.

devEndpointCustomLibraries_extraPythonLibsS3Path :: Lens' DevEndpointCustomLibraries (Maybe Text) Source #

The paths to one or more Python libraries in an Amazon Simple Storage Service (Amazon S3) bucket that should be loaded in your DevEndpoint. Multiple values must be complete paths separated by a comma.

You can only use pure Python libraries with a DevEndpoint. Libraries that rely on C extensions, such as the pandas Python data analysis library, are not currently supported.

DirectKafkaSource

directKafkaSource_dataPreviewOptions :: Lens' DirectKafkaSource (Maybe StreamingDataPreviewOptions) Source #

Specifies options related to data preview for viewing a sample of your data.

directKafkaSource_detectSchema :: Lens' DirectKafkaSource (Maybe Bool) Source #

Whether to automatically determine the schema from the incoming data.

directKafkaSource_windowSize :: Lens' DirectKafkaSource (Maybe Natural) Source #

The amount of time to spend processing each micro batch.

DirectKinesisSource

directKinesisSource_detectSchema :: Lens' DirectKinesisSource (Maybe Bool) Source #

Whether to automatically determine the schema from the incoming data.

directKinesisSource_streamingOptions :: Lens' DirectKinesisSource (Maybe KinesisStreamingSourceOptions) Source #

Additional options for the Kinesis streaming data source.

directKinesisSource_windowSize :: Lens' DirectKinesisSource (Maybe Natural) Source #

The amount of time to spend processing each micro batch.

DirectSchemaChangePolicy

directSchemaChangePolicy_database :: Lens' DirectSchemaChangePolicy (Maybe Text) Source #

Specifies the database that the schema change policy applies to.

directSchemaChangePolicy_enableUpdateCatalog :: Lens' DirectSchemaChangePolicy (Maybe Bool) Source #

Whether to use the specified update behavior when the crawler finds a changed schema.

directSchemaChangePolicy_table :: Lens' DirectSchemaChangePolicy (Maybe Text) Source #

Specifies the table in the database that the schema change policy applies to.

directSchemaChangePolicy_updateBehavior :: Lens' DirectSchemaChangePolicy (Maybe UpdateCatalogBehavior) Source #

The update behavior when the crawler finds a changed schema.

DoubleColumnStatisticsData

DropDuplicates

dropDuplicates_columns :: Lens' DropDuplicates (Maybe [[Text]]) Source #

The names of the columns to be merged or removed if repeating.

dropDuplicates_name :: Lens' DropDuplicates Text Source #

The name of the transform node.

dropDuplicates_inputs :: Lens' DropDuplicates (NonEmpty Text) Source #

The data inputs identified by their node names.

DropFields

dropFields_name :: Lens' DropFields Text Source #

The name of the transform node.

dropFields_inputs :: Lens' DropFields (NonEmpty Text) Source #

The data inputs identified by their node names.

dropFields_paths :: Lens' DropFields [[Text]] Source #

A JSON path to a variable in the data structure.

DropNullFields

dropNullFields_nullCheckBoxList :: Lens' DropNullFields (Maybe NullCheckBoxList) Source #

A structure that represents whether certain values are recognized as null values for removal.

dropNullFields_nullTextList :: Lens' DropNullFields (Maybe [NullValueField]) Source #

A structure that specifies a list of NullValueField structures that represent a custom null value such as zero or other value being used as a null placeholder unique to the dataset.

The DropNullFields transform removes custom null values only if both the value of the null placeholder and the datatype match the data.

dropNullFields_name :: Lens' DropNullFields Text Source #

The name of the transform node.

dropNullFields_inputs :: Lens' DropNullFields (NonEmpty Text) Source #

The data inputs identified by their node names.

DynamicTransform

dynamicTransform_parameters :: Lens' DynamicTransform (Maybe [TransformConfigParameter]) Source #

Specifies the parameters of the dynamic transform.

dynamicTransform_version :: Lens' DynamicTransform (Maybe Text) Source #

This field is not used and will be deprecated in a future release.

dynamicTransform_name :: Lens' DynamicTransform Text Source #

Specifies the name of the dynamic transform.

dynamicTransform_transformName :: Lens' DynamicTransform Text Source #

Specifies the name of the dynamic transform as it appears in the Glue Studio visual editor.

dynamicTransform_inputs :: Lens' DynamicTransform (NonEmpty Text) Source #

Specifies the inputs for the dynamic transform that are required.

dynamicTransform_functionName :: Lens' DynamicTransform Text Source #

Specifies the name of the function of the dynamic transform.

dynamicTransform_path :: Lens' DynamicTransform Text Source #

Specifies the path of the dynamic transform source and config files.

DynamoDBCatalogSource

dynamoDBCatalogSource_database :: Lens' DynamoDBCatalogSource Text Source #

The name of the database to read from.

dynamoDBCatalogSource_table :: Lens' DynamoDBCatalogSource Text Source #

The name of the table in the database to read from.

DynamoDBTarget

dynamoDBTarget_path :: Lens' DynamoDBTarget (Maybe Text) Source #

The name of the DynamoDB table to crawl.

dynamoDBTarget_scanAll :: Lens' DynamoDBTarget (Maybe Bool) Source #

Indicates whether to scan all the records, or to sample rows from the table. Scanning all the records can take a long time when the table is not a high throughput table.

A value of true means to scan all records, while a value of false means to sample the records. If no value is specified, the value defaults to true.

dynamoDBTarget_scanRate :: Lens' DynamoDBTarget (Maybe Double) Source #

The percentage of the configured read capacity units to use by the Glue crawler. Read capacity units is a term defined by DynamoDB, and is a numeric value that acts as a rate limiter for the number of reads that can be performed on that table per second.

The valid values are null or a value between 0.1 and 1.5. A null value is used when the user does not provide a value, and defaults to 0.5 of the configured Read Capacity Unit (for provisioned tables), or 0.25 of the max configured Read Capacity Unit (for tables using on-demand mode).
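
The range and defaults described for the scan rate can be expressed as a small resolver. In this sketch `CapacityMode` and `effectiveScanRate` are hypothetical names introduced for illustration, and `Nothing` plays the role of the null value; the bounds are treated as inclusive, which is an assumption.

```haskell
-- Hypothetical tag for how the table's read capacity is configured.
data CapacityMode = Provisioned | OnDemand
  deriving (Eq, Show)

-- | Resolve the scan rate: apply the documented default when no value is
-- given, otherwise accept only values in the range 0.1 to 1.5.
effectiveScanRate :: CapacityMode -> Maybe Double -> Either String Double
effectiveScanRate Provisioned Nothing = Right 0.5
effectiveScanRate OnDemand Nothing = Right 0.25
effectiveScanRate _ (Just r)
  | r >= 0.1 && r <= 1.5 = Right r
  | otherwise = Left ("scanRate out of range: " ++ show r)

main :: IO ()
main = do
  print (effectiveScanRate Provisioned Nothing)
  print (effectiveScanRate OnDemand (Just 1.0))
  print (effectiveScanRate OnDemand (Just 2.0))
```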

Edge

edge_destinationId :: Lens' Edge (Maybe Text) Source #

The unique ID of the node within the workflow where the edge ends.

edge_sourceId :: Lens' Edge (Maybe Text) Source #

The unique ID of the node within the workflow where the edge starts.

EncryptionAtRest

encryptionAtRest_sseAwsKmsKeyId :: Lens' EncryptionAtRest (Maybe Text) Source #

The ID of the KMS key to use for encryption at rest.

encryptionAtRest_catalogEncryptionMode :: Lens' EncryptionAtRest CatalogEncryptionMode Source #

The encryption-at-rest mode for encrypting Data Catalog data.

EncryptionConfiguration

encryptionConfiguration_s3Encryption :: Lens' EncryptionConfiguration (Maybe [S3Encryption]) Source #

The encryption configuration for Amazon Simple Storage Service (Amazon S3) data.

ErrorDetail

errorDetail_errorCode :: Lens' ErrorDetail (Maybe Text) Source #

The code associated with this error.

errorDetail_errorMessage :: Lens' ErrorDetail (Maybe Text) Source #

A message describing the error.

ErrorDetails

errorDetails_errorCode :: Lens' ErrorDetails (Maybe Text) Source #

The error code for an error.

errorDetails_errorMessage :: Lens' ErrorDetails (Maybe Text) Source #

The error message for an error.

EvaluateDataQuality

evaluateDataQuality_output :: Lens' EvaluateDataQuality (Maybe DQTransformOutput) Source #

The output of your data quality evaluation.

evaluateDataQuality_stopJobOnFailureOptions :: Lens' EvaluateDataQuality (Maybe DQStopJobOnFailureOptions) Source #

Options to configure how your job will stop if your data quality evaluation fails.

evaluateDataQuality_name :: Lens' EvaluateDataQuality Text Source #

The name of the data quality evaluation.

evaluateDataQuality_inputs :: Lens' EvaluateDataQuality (NonEmpty Text) Source #

The inputs of your data quality evaluation.

evaluateDataQuality_ruleset :: Lens' EvaluateDataQuality Text Source #

The ruleset for your data quality evaluation.

EvaluationMetrics

evaluationMetrics_findMatchesMetrics :: Lens' EvaluationMetrics (Maybe FindMatchesMetrics) Source #

The evaluation metrics for the find matches algorithm.

EventBatchingCondition

eventBatchingCondition_batchWindow :: Lens' EventBatchingCondition (Maybe Natural) Source #

The window of time, in seconds, after which the EventBridge event trigger fires. The window starts when the first event is received.

eventBatchingCondition_batchSize :: Lens' EventBatchingCondition Natural Source #

The number of events that must be received from Amazon EventBridge before the EventBridge event trigger fires.

ExecutionProperty

executionProperty_maxConcurrentRuns :: Lens' ExecutionProperty (Maybe Int) Source #

The maximum number of concurrent runs allowed for the job. The default is 1. An error is returned when this threshold is reached. The maximum value you can specify is controlled by a service limit.

ExportLabelsTaskRunProperties

exportLabelsTaskRunProperties_outputS3Path :: Lens' ExportLabelsTaskRunProperties (Maybe Text) Source #

The Amazon Simple Storage Service (Amazon S3) path where you will export the labels.

FillMissingValues

fillMissingValues_filledPath :: Lens' FillMissingValues (Maybe Text) Source #

A JSON path to a variable in the data structure for the dataset that is filled.

fillMissingValues_name :: Lens' FillMissingValues Text Source #

The name of the transform node.

fillMissingValues_inputs :: Lens' FillMissingValues (NonEmpty Text) Source #

The data inputs identified by their node names.

fillMissingValues_imputedPath :: Lens' FillMissingValues Text Source #

A JSON path to a variable in the data structure for the dataset that is imputed.

Filter

filter_name :: Lens' Filter Text Source #

The name of the transform node.

filter_inputs :: Lens' Filter (NonEmpty Text) Source #

The data inputs identified by their node names.

filter_logicalOperator :: Lens' Filter FilterLogicalOperator Source #

The operator used to filter rows by comparing the key value to a specified value.

filter_filters :: Lens' Filter [FilterExpression] Source #

Specifies a filter expression.

FilterExpression

filterExpression_negated :: Lens' FilterExpression (Maybe Bool) Source #

Whether the expression is to be negated.

filterExpression_operation :: Lens' FilterExpression FilterOperation Source #

The type of operation to perform in the expression.

FilterValue

filterValue_value :: Lens' FilterValue [Text] Source #

The value to be associated.

FindMatchesMetrics

findMatchesMetrics_areaUnderPRCurve :: Lens' FindMatchesMetrics (Maybe Double) Source #

The area under the precision/recall curve (AUPRC) is a single number that measures the overall quality of the transform, independent of the choice made for precision vs. recall. Higher values indicate that you have a more attractive precision vs. recall tradeoff.

For more information, see Precision and recall in Wikipedia.

findMatchesMetrics_columnImportances :: Lens' FindMatchesMetrics (Maybe [ColumnImportance]) Source #

A list of ColumnImportance structures containing column importance metrics, sorted in order of descending importance.

findMatchesMetrics_confusionMatrix :: Lens' FindMatchesMetrics (Maybe ConfusionMatrix) Source #

The confusion matrix shows you what your transform is predicting accurately and what types of errors it is making.

For more information, see Confusion matrix in Wikipedia.

findMatchesMetrics_f1 :: Lens' FindMatchesMetrics (Maybe Double) Source #

The maximum F1 metric indicates the transform's accuracy between 0 and 1, where 1 is the best accuracy.

For more information, see F1 score in Wikipedia.

findMatchesMetrics_precision :: Lens' FindMatchesMetrics (Maybe Double) Source #

The precision metric indicates how often your transform is correct when it predicts a match. Specifically, it measures how well the transform finds true positives from the total true positives possible.

For more information, see Precision and recall in Wikipedia.

findMatchesMetrics_recall :: Lens' FindMatchesMetrics (Maybe Double) Source #

The recall metric indicates, for an actual match, how often your transform predicts the match. Specifically, it measures how well the transform finds true positives from the total records in the source data.

For more information, see Precision and recall in Wikipedia.
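
For orientation, the standard textbook definitions of precision, recall and F1 can be written out from confusion-matrix counts. This is a sketch using base only: Counts and its fields are hypothetical stand-ins rather than amazonka's ConfusionMatrix type, and how Glue computes the values it reports is not specified here.

```haskell
-- | Hypothetical counts from a binary confusion matrix; a local stand-in,
-- not amazonka's ConfusionMatrix type.
data Counts = Counts
  { truePositives  :: Double
  , falsePositives :: Double
  , falseNegatives :: Double
  }

-- | Of the predicted matches, the fraction that are real matches.
-- Yields NaN when there are no predicted matches at all.
precision :: Counts -> Double
precision c = truePositives c / (truePositives c + falsePositives c)

-- | Of the real matches, the fraction that were predicted.
-- Yields NaN when there are no real matches at all.
recall :: Counts -> Double
recall c = truePositives c / (truePositives c + falseNegatives c)

-- | Harmonic mean of precision and recall.
f1 :: Counts -> Double
f1 c = 2 * p * r / (p + r)
  where
    p = precision c
    r = recall c
```

With 8 true positives, 2 false positives and 2 false negatives, precision and recall are both 8/10, and F1 (their harmonic mean) is the same value up to floating-point rounding.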

FindMatchesParameters

findMatchesParameters_accuracyCostTradeoff :: Lens' FindMatchesParameters (Maybe Double) Source #

The value that is selected when tuning your transform for a balance between accuracy and cost. A value of 0.5 means that the system balances accuracy and cost concerns. A value of 1.0 means a bias purely for accuracy, which typically results in a higher cost, sometimes substantially higher. A value of 0.0 means a bias purely for cost, which results in a less accurate FindMatches transform, sometimes with unacceptable accuracy.

Accuracy measures how well the transform finds true positives and true negatives. Increasing accuracy requires more machine resources and cost. But it also results in increased recall.

Cost measures how many compute resources, and thus money, are consumed to run the transform.

findMatchesParameters_enforceProvidedLabels :: Lens' FindMatchesParameters (Maybe Bool) Source #

The value to switch on or off to force the output to match the provided labels from users. If the value is True, the find matches transform forces the output to match the provided labels. The results override the normal conflation results. If the value is False, the find matches transform does not ensure all the labels provided are respected, and the results rely on the trained model.

Note that setting this value to true may increase the conflation execution time.

findMatchesParameters_precisionRecallTradeoff :: Lens' FindMatchesParameters (Maybe Double) Source #

The value selected when tuning your transform for a balance between precision and recall. A value of 0.5 means no preference; a value of 1.0 means a bias purely for precision, and a value of 0.0 means a bias for recall. Because this is a tradeoff, choosing values close to 1.0 means very low recall, and choosing values close to 0.0 results in very low precision.

The precision metric indicates how often your model is correct when it predicts a match.

The recall metric indicates, for an actual match, how often your model predicts the match.

findMatchesParameters_primaryKeyColumnName :: Lens' FindMatchesParameters (Maybe Text) Source #

The name of a column that uniquely identifies rows in the source table. Used to help identify matching records.

FindMatchesTaskRunProperties

findMatchesTaskRunProperties_jobName :: Lens' FindMatchesTaskRunProperties (Maybe Text) Source #

The name assigned to the job for the Find Matches task run.

GetConnectionsFilter

getConnectionsFilter_connectionType :: Lens' GetConnectionsFilter (Maybe ConnectionType) Source #

The type of connections to return. Currently, SFTP is not supported.

getConnectionsFilter_matchCriteria :: Lens' GetConnectionsFilter (Maybe [Text]) Source #

A criteria string that must match the criteria recorded in the connection definition for that connection definition to be returned.

GluePolicy

gluePolicy_createTime :: Lens' GluePolicy (Maybe UTCTime) Source #

The date and time at which the policy was created.

gluePolicy_policyHash :: Lens' GluePolicy (Maybe Text) Source #

Contains the hash value associated with this policy.

gluePolicy_policyInJson :: Lens' GluePolicy (Maybe Text) Source #

Contains the requested policy document, in JSON format.

gluePolicy_updateTime :: Lens' GluePolicy (Maybe UTCTime) Source #

The date and time at which the policy was last updated.

GlueSchema

glueSchema_columns :: Lens' GlueSchema (Maybe [GlueStudioSchemaColumn]) Source #

Specifies the column definitions that make up a Glue schema.

GlueStudioSchemaColumn

glueStudioSchemaColumn_type :: Lens' GlueStudioSchemaColumn (Maybe Text) Source #

The Hive type for this column in the Glue Studio schema.

glueStudioSchemaColumn_name :: Lens' GlueStudioSchemaColumn Text Source #

The name of the column in the Glue Studio schema.

GlueTable

glueTable_additionalOptions :: Lens' GlueTable (Maybe (HashMap Text Text)) Source #

Additional options for the table. Currently there are two keys supported:

  • pushDownPredicate: to filter on partitions without having to list and read all the files in your dataset.
  • catalogPartitionPredicate: to use server-side partition pruning using partition indexes in the Glue Data Catalog.

glueTable_catalogId :: Lens' GlueTable (Maybe Text) Source #

A unique identifier for the Glue Data Catalog.

glueTable_connectionName :: Lens' GlueTable (Maybe Text) Source #

The name of the connection to the Glue Data Catalog.

glueTable_databaseName :: Lens' GlueTable Text Source #

A database name in the Glue Data Catalog.

glueTable_tableName :: Lens' GlueTable Text Source #

A table name in the Glue Data Catalog.

GovernedCatalogSource

governedCatalogSource_partitionPredicate :: Lens' GovernedCatalogSource (Maybe Text) Source #

Partitions satisfying this predicate are deleted. Files within the retention period in these partitions are not deleted. Set to "" – empty by default.

GovernedCatalogTarget

governedCatalogTarget_partitionKeys :: Lens' GovernedCatalogTarget (Maybe [[Text]]) Source #

Specifies native partitioning using a sequence of keys.

governedCatalogTarget_schemaChangePolicy :: Lens' GovernedCatalogTarget (Maybe CatalogSchemaChangePolicy) Source #

A policy that specifies update behavior for the governed catalog.

governedCatalogTarget_inputs :: Lens' GovernedCatalogTarget (NonEmpty Text) Source #

The nodes that are inputs to the data target.

governedCatalogTarget_table :: Lens' GovernedCatalogTarget Text Source #

The name of the table in the database to write to.

governedCatalogTarget_database :: Lens' GovernedCatalogTarget Text Source #

The name of the database to write to.

GrokClassifier

grokClassifier_creationTime :: Lens' GrokClassifier (Maybe UTCTime) Source #

The time that this classifier was registered.

grokClassifier_customPatterns :: Lens' GrokClassifier (Maybe Text) Source #

Optional custom grok patterns defined by this classifier. For more information, see custom patterns in Writing Custom Classifiers.

grokClassifier_lastUpdated :: Lens' GrokClassifier (Maybe UTCTime) Source #

The time that this classifier was last updated.

grokClassifier_version :: Lens' GrokClassifier (Maybe Integer) Source #

The version of this classifier.

grokClassifier_name :: Lens' GrokClassifier Text Source #

The name of the classifier.

grokClassifier_classification :: Lens' GrokClassifier Text Source #

An identifier of the data format that the classifier matches, such as Twitter, JSON, Omniture logs, and so on.

grokClassifier_grokPattern :: Lens' GrokClassifier Text Source #

The grok pattern applied to a data store by this classifier. For more information, see built-in patterns in Writing Custom Classifiers.

ImportLabelsTaskRunProperties

importLabelsTaskRunProperties_inputS3Path :: Lens' ImportLabelsTaskRunProperties (Maybe Text) Source #

The Amazon Simple Storage Service (Amazon S3) path from where you will import the labels.

importLabelsTaskRunProperties_replace :: Lens' ImportLabelsTaskRunProperties (Maybe Bool) Source #

Indicates whether to overwrite your existing labels.

JDBCConnectorOptions

jDBCConnectorOptions_dataTypeMapping :: Lens' JDBCConnectorOptions (Maybe (HashMap JDBCDataType GlueRecordType)) Source #

Custom data type mapping that builds a mapping from a JDBC data type to a Glue data type. For example, the option "dataTypeMapping":{"FLOAT":"STRING"} maps data fields of JDBC type FLOAT into the Java String type by calling the ResultSet.getString() method of the driver, and uses it to build the Glue record. The ResultSet object is implemented by each driver, so the behavior is specific to the driver you use. Refer to the documentation for your JDBC driver to understand how the driver performs the conversions.

jDBCConnectorOptions_filterPredicate :: Lens' JDBCConnectorOptions (Maybe Text) Source #

Extra condition clause to filter data from source. For example:

BillingCity='Mountain View'

When using a query instead of a table name, you should validate that the query works with the specified filterPredicate.

jDBCConnectorOptions_jobBookmarkKeys :: Lens' JDBCConnectorOptions (Maybe [Text]) Source #

The names of the job bookmark keys on which to sort.

jDBCConnectorOptions_jobBookmarkKeysSortOrder :: Lens' JDBCConnectorOptions (Maybe Text) Source #

Specifies an ascending or descending sort order.

jDBCConnectorOptions_lowerBound :: Lens' JDBCConnectorOptions (Maybe Natural) Source #

The minimum value of partitionColumn that is used to decide partition stride.

jDBCConnectorOptions_numPartitions :: Lens' JDBCConnectorOptions (Maybe Natural) Source #

The number of partitions. This value, along with lowerBound (inclusive) and upperBound (exclusive), forms partition strides for generated WHERE clause expressions that are used to split the partitionColumn.

jDBCConnectorOptions_partitionColumn :: Lens' JDBCConnectorOptions (Maybe Text) Source #

The name of an integer column that is used for partitioning. This option works only when it's included with lowerBound, upperBound, and numPartitions. This option works the same way as in the Spark SQL JDBC reader.

jDBCConnectorOptions_upperBound :: Lens' JDBCConnectorOptions (Maybe Natural) Source #

The maximum value of partitionColumn that is used to decide partition stride.
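
To make the stride idea concrete, here is a simplified sketch of how lowerBound (inclusive), upperBound (exclusive) and numPartitions could be turned into WHERE clauses. The names strideBounds and whereClauses are hypothetical, and the splitting rule is an assumption for illustration; the clauses actually generated by Glue or the Spark SQL JDBC reader may differ, in particular for NULL values and for rows outside the bounds.

```haskell
-- | Split the half-open range [lower, upper) into n contiguous half-open
-- ranges of near-equal width. Returns [] for an empty range or n <= 0.
strideBounds :: Integer -> Integer -> Integer -> [(Integer, Integer)]
strideBounds lower upper n
  | n <= 0 || upper <= lower = []
  | otherwise = [ (edge i, edge (i + 1)) | i <- [0 .. n - 1] ]
  where
    -- Integer division keeps every edge inside the range; edge 0 is the
    -- lower bound and edge n is the upper bound.
    edge i = lower + (upper - lower) * i `div` n

-- | Render one WHERE clause per stride for the given partition column.
whereClauses :: String -> Integer -> Integer -> Integer -> [String]
whereClauses col lower upper n =
  [ col ++ " >= " ++ show lo ++ " AND " ++ col ++ " < " ++ show hi
  | (lo, hi) <- strideBounds lower upper n
  ]
```

For a column id with bounds 0 and 10 and two partitions, whereClauses produces one clause covering 0 up to 5 and another covering 5 up to 10. If numPartitions exceeds the width of the range, some strides in this sketch are empty.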

JDBCConnectorSource

jDBCConnectorSource_connectionTable :: Lens' JDBCConnectorSource (Maybe Text) Source #

The name of the table in the data source.

jDBCConnectorSource_outputSchemas :: Lens' JDBCConnectorSource (Maybe [GlueSchema]) Source #

Specifies the data schema for the custom JDBC source.

jDBCConnectorSource_query :: Lens' JDBCConnectorSource (Maybe Text) Source #

The table or SQL query to get the data from. You can specify either ConnectionTable or query, but not both.

jDBCConnectorSource_connectionName :: Lens' JDBCConnectorSource Text Source #

The name of the connection that is associated with the connector.

jDBCConnectorSource_connectorName :: Lens' JDBCConnectorSource Text Source #

The name of a connector that assists with accessing the data store in Glue Studio.

jDBCConnectorSource_connectionType :: Lens' JDBCConnectorSource Text Source #

The type of connection, such as marketplace.jdbc or custom.jdbc, designating a connection to a JDBC data store.

JDBCConnectorTarget

jDBCConnectorTarget_additionalOptions :: Lens' JDBCConnectorTarget (Maybe (HashMap Text Text)) Source #

Additional connection options for the connector.

jDBCConnectorTarget_outputSchemas :: Lens' JDBCConnectorTarget (Maybe [GlueSchema]) Source #

Specifies the data schema for the JDBC target.

jDBCConnectorTarget_inputs :: Lens' JDBCConnectorTarget (NonEmpty Text) Source #

The nodes that are inputs to the data target.

jDBCConnectorTarget_connectionName :: Lens' JDBCConnectorTarget Text Source #

The name of the connection that is associated with the connector.

jDBCConnectorTarget_connectionTable :: Lens' JDBCConnectorTarget Text Source #

The name of the table in the data target.

jDBCConnectorTarget_connectorName :: Lens' JDBCConnectorTarget Text Source #

The name of a connector that will be used.

jDBCConnectorTarget_connectionType :: Lens' JDBCConnectorTarget Text Source #

The type of connection, such as marketplace.jdbc or custom.jdbc, designating a connection to a JDBC data target.

JdbcTarget

jdbcTarget_connectionName :: Lens' JdbcTarget (Maybe Text) Source #

The name of the connection to use to connect to the JDBC target.

jdbcTarget_enableAdditionalMetadata :: Lens' JdbcTarget (Maybe [JdbcMetadataEntry]) Source #

Specify a value of RAWTYPES or COMMENTS to enable additional metadata in table responses. RAWTYPES provides the native-level datatype. COMMENTS provides comments associated with a column or table in the database.

If you do not need additional metadata, keep the field empty.

jdbcTarget_exclusions :: Lens' JdbcTarget (Maybe [Text]) Source #

A list of glob patterns used to exclude from the crawl. For more information, see Catalog Tables with a Crawler.

jdbcTarget_path :: Lens' JdbcTarget (Maybe Text) Source #

The path of the JDBC target.

Job

job_allocatedCapacity :: Lens' Job (Maybe Int) Source #

This field is deprecated. Use MaxCapacity instead.

The number of Glue data processing units (DPUs) allocated to runs of this job. You can allocate a minimum of 2 DPUs; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page.

job_codeGenConfigurationNodes :: Lens' Job (Maybe (HashMap Text CodeGenConfigurationNode)) Source #

The representation of a directed acyclic graph on which both the Glue Studio visual component and Glue Studio code generation are based.

job_command :: Lens' Job (Maybe JobCommand) Source #

The JobCommand that runs this job.

job_connections :: Lens' Job (Maybe ConnectionsList) Source #

The connections used for this job.

job_createdOn :: Lens' Job (Maybe UTCTime) Source #

The time and date that this job definition was created.

job_defaultArguments :: Lens' Job (Maybe (HashMap Text Text)) Source #

The default arguments for this job, specified as name-value pairs.

You can specify arguments here that your own job-execution script consumes, as well as arguments that Glue itself consumes.

For information about how to specify and consume your own Job arguments, see the Calling Glue APIs in Python topic in the developer guide.

For information about the key-value pairs that Glue consumes to set up your job, see the Special Parameters Used by Glue topic in the developer guide.

job_description :: Lens' Job (Maybe Text) Source #

A description of the job.

job_executionClass :: Lens' Job (Maybe ExecutionClass) Source #

Indicates whether the job is run with a standard or flexible execution class. The standard execution class is ideal for time-sensitive workloads that require fast job startup and dedicated resources.

The flexible execution class is appropriate for time-insensitive jobs whose start and completion times may vary.

Only jobs with Glue version 3.0 and above and command type glueetl will be allowed to set ExecutionClass to FLEX. The flexible execution class is available for Spark jobs.

job_executionProperty :: Lens' Job (Maybe ExecutionProperty) Source #

An ExecutionProperty specifying the maximum number of concurrent runs allowed for this job.

job_glueVersion :: Lens' Job (Maybe Text) Source #

Glue version determines the versions of Apache Spark and Python that Glue supports. The Python version indicates the version supported for jobs of type Spark.

For more information about the available Glue versions and corresponding Spark and Python versions, see Glue version in the developer guide.

Jobs that are created without specifying a Glue version default to Glue 0.9.

job_lastModifiedOn :: Lens' Job (Maybe UTCTime) Source #

The last point in time when this job definition was modified.

job_logUri :: Lens' Job (Maybe Text) Source #

This field is reserved for future use.

job_maxCapacity :: Lens' Job (Maybe Double) Source #

For Glue version 1.0 or earlier jobs, using the standard worker type, the number of Glue data processing units (DPUs) that can be allocated when this job runs. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page.

Do not set Max Capacity if using WorkerType and NumberOfWorkers.

The value that can be allocated for MaxCapacity depends on whether you are running a Python shell job, an Apache Spark ETL job, or an Apache Spark streaming ETL job:

  • When you specify a Python shell job (JobCommand.Name="pythonshell"), you can allocate either 0.0625 or 1 DPU. The default is 0.0625 DPU.
  • When you specify an Apache Spark ETL job (JobCommand.Name="glueetl") or Apache Spark streaming ETL job (JobCommand.Name="gluestreaming"), you can allocate a minimum of 2 DPUs. The default is 10 DPUs. This job type cannot have a fractional DPU allocation.

For Glue version 2.0 jobs, you cannot specify a Maximum capacity. Instead, you should specify a Worker type and the Number of workers.
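
The MaxCapacity rules above can be summarised as a small check keyed on the job command name. This is a sketch using base only: validMaxCapacity and defaultMaxCapacity are hypothetical helpers, and the service remains the authority on which values it accepts.

```haskell
-- | Whether a MaxCapacity value (in DPUs) fits the documented rule for the
-- given JobCommand name. Unknown command names are rejected.
validMaxCapacity :: String -> Double -> Bool
validMaxCapacity "pythonshell" dpu = dpu == 0.0625 || dpu == 1
validMaxCapacity name dpu
  | name `elem` ["glueetl", "gluestreaming"] =
      -- At least 2 DPUs, and no fractional allocation.
      dpu >= 2 && dpu == fromIntegral (round dpu :: Integer)
  | otherwise = False

-- | The documented default for each command name, if the name is known.
defaultMaxCapacity :: String -> Maybe Double
defaultMaxCapacity "pythonshell"   = Just 0.0625
defaultMaxCapacity "glueetl"       = Just 10
defaultMaxCapacity "gluestreaming" = Just 10
defaultMaxCapacity _               = Nothing
```

Under these rules, 0.0625 is accepted for a pythonshell job but not for glueetl, and 2.5 is rejected for glueetl because it is fractional.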

job_maxRetries :: Lens' Job (Maybe Int) Source #

The maximum number of times to retry this job after a JobRun fails.

job_name :: Lens' Job (Maybe Text) Source #

The name you assign to this job definition.

job_nonOverridableArguments :: Lens' Job (Maybe (HashMap Text Text)) Source #

Non-overridable arguments for this job, specified as name-value pairs.

job_notificationProperty :: Lens' Job (Maybe NotificationProperty) Source #

Specifies configuration properties of a job notification.

job_numberOfWorkers :: Lens' Job (Maybe Int) Source #

The number of workers of a defined workerType that are allocated when a job runs.

job_role :: Lens' Job (Maybe Text) Source #

The name or Amazon Resource Name (ARN) of the IAM role associated with this job.

job_securityConfiguration :: Lens' Job (Maybe Text) Source #

The name of the SecurityConfiguration structure to be used with this job.

job_sourceControlDetails :: Lens' Job (Maybe SourceControlDetails) Source #

The details for a source control configuration for a job, allowing synchronization of job artifacts to or from a remote repository.

job_timeout :: Lens' Job (Maybe Natural) Source #

The job timeout in minutes. This is the maximum time that a job run can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).
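
As an illustration of how a lens with this shape is read and written, the sketch below rebuilds the idea with base only. Everything in it is a stand-in: the Job newtype has a single field, and job_timeout, view and set are local definitions that mimic the van Laarhoven encoding which Lens' abbreviates. With the real library you would presumably import the lens from this module together with the operators of a lens package.

```haskell
import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))
import Data.Maybe (fromMaybe)
import Numeric.Natural (Natural)

-- | Hypothetical stand-in for amazonka's Job, reduced to one field.
newtype Job = Job { timeout :: Maybe Natural }
  deriving (Eq, Show)

-- | Stand-in for job_timeout. The signature is the expansion of
-- Lens' Job (Maybe Natural) in the van Laarhoven encoding.
job_timeout :: Functor f => (Maybe Natural -> f (Maybe Natural)) -> Job -> f Job
job_timeout f j = (\t -> j { timeout = t }) <$> f (timeout j)

-- | Read the focused field by running the lens with the Const functor.
view :: ((a -> Const a a) -> s -> Const a s) -> s -> a
view l = getConst . l Const

-- | Overwrite the focused field by running the lens with Identity.
set :: ((a -> Identity a) -> s -> Identity s) -> a -> s -> s
set l a = runIdentity . l (const (Identity a))

-- | Timeout in minutes, falling back to the documented default of 2,880.
timeoutMinutes :: Job -> Natural
timeoutMinutes = fromMaybe 2880 . view job_timeout
```

Here timeoutMinutes (Job Nothing) falls back to the default, while set job_timeout (Just 60) replaces the field before it is read.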

job_workerType :: Lens' Job (Maybe WorkerType) Source #

The type of predefined worker that is allocated when a job runs. Accepts a value of Standard, G.1X, G.2X, or G.025X.

  • For the Standard worker type, each worker provides 4 vCPU, 16 GB of memory and a 50 GB disk, and 2 executors per worker.
  • For the G.1X worker type, each worker maps to 1 DPU (4 vCPU, 16 GB of memory, 64 GB disk), and provides 1 executor per worker. We recommend this worker type for memory-intensive jobs.
  • For the G.2X worker type, each worker maps to 2 DPU (8 vCPU, 32 GB of memory, 128 GB disk), and provides 1 executor per worker. We recommend this worker type for memory-intensive jobs.
  • For the G.025X worker type, each worker maps to 0.25 DPU (2 vCPU, 4 GB of memory, 64 GB disk), and provides 1 executor per worker. We recommend this worker type for low volume streaming jobs. This worker type is only available for Glue version 3.0 streaming jobs.

JobBookmarkEntry

jobBookmarkEntry_jobName :: Lens' JobBookmarkEntry (Maybe Text) Source #

The name of the job in question.

jobBookmarkEntry_previousRunId :: Lens' JobBookmarkEntry (Maybe Text) Source #

The unique run identifier associated with the previous job run.

JobBookmarksEncryption

jobBookmarksEncryption_kmsKeyArn :: Lens' JobBookmarksEncryption (Maybe Text) Source #

The Amazon Resource Name (ARN) of the KMS key to be used to encrypt the data.

JobCommand

jobCommand_name :: Lens' JobCommand (Maybe Text) Source #

The name of the job command. For an Apache Spark ETL job, this must be glueetl. For a Python shell job, it must be pythonshell. For an Apache Spark streaming ETL job, this must be gluestreaming.

jobCommand_pythonVersion :: Lens' JobCommand (Maybe Text) Source #

The Python version being used to run a Python shell job. Allowed values are 2 or 3.

jobCommand_scriptLocation :: Lens' JobCommand (Maybe Text) Source #

Specifies the Amazon Simple Storage Service (Amazon S3) path to a script that runs a job.

JobNodeDetails

jobNodeDetails_jobRuns :: Lens' JobNodeDetails (Maybe [JobRun]) Source #

The information for the job runs represented by the job node.

JobRun

jobRun_allocatedCapacity :: Lens' JobRun (Maybe Int) Source #

This field is deprecated. Use MaxCapacity instead.

The number of Glue data processing units (DPUs) allocated to this JobRun. From 2 to 100 DPUs can be allocated; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page.

jobRun_arguments :: Lens' JobRun (Maybe (HashMap Text Text)) Source #

The job arguments associated with this run. For this job run, they replace the default arguments set in the job definition itself.

You can specify arguments here that your own job-execution script consumes, as well as arguments that Glue itself consumes.

For information about how to specify and consume your own job arguments, see the Calling Glue APIs in Python topic in the developer guide.

For information about the key-value pairs that Glue consumes to set up your job, see the Special Parameters Used by Glue topic in the developer guide.

jobRun_attempt :: Lens' JobRun (Maybe Int) Source #

The number of the attempt to run this job.

jobRun_completedOn :: Lens' JobRun (Maybe UTCTime) Source #

The date and time that this job run completed.

jobRun_dPUSeconds :: Lens' JobRun (Maybe Double) Source #

This field populates only for Auto Scaling job runs, and represents the total time each executor ran during the lifecycle of a job run in seconds, multiplied by a DPU factor (1 for G.1X, 2 for G.2X, or 0.25 for G.025X workers). This value may differ from executionEngineRuntime * MaxCapacity: in Auto Scaling jobs, the number of executors running at a given time may be less than MaxCapacity, so the value of DPUSeconds can be less than executionEngineRuntime * MaxCapacity.
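
The arithmetic behind this field can be sketched as follows, under the assumption that the per-executor runtimes are summed before the DPU factor is applied. Worker, dpuFactor and dpuSeconds are hypothetical names using base only, and the Standard worker type is left out because the text gives no factor for it.

```haskell
-- | Local stand-in for the worker types that have a documented DPU factor.
data Worker = G1X | G2X | G025X
  deriving (Eq, Show)

-- | DPU factor per worker type, as listed in the description above.
dpuFactor :: Worker -> Double
dpuFactor G1X   = 1
dpuFactor G2X   = 2
dpuFactor G025X = 0.25

-- | Sum the seconds each executor ran, then scale by the DPU factor.
-- Summing across executors is an assumption made for this sketch.
dpuSeconds :: Worker -> [Double] -> Double
dpuSeconds w executorSeconds = dpuFactor w * sum executorSeconds
```

For example, two G.2X executors that ran for 100 and 50 seconds give 2 * 150 DPU-seconds in this model.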

jobRun_errorMessage :: Lens' JobRun (Maybe Text) Source #

An error message associated with this job run.

jobRun_executionClass :: Lens' JobRun (Maybe ExecutionClass) Source #

Indicates whether the job is run with a standard or flexible execution class. The standard execution class is ideal for time-sensitive workloads that require fast job startup and dedicated resources.

The flexible execution class is appropriate for time-insensitive jobs whose start and completion times may vary.

Only jobs with Glue version 3.0 and above and command type glueetl will be allowed to set ExecutionClass to FLEX. The flexible execution class is available for Spark jobs.

jobRun_executionTime :: Lens' JobRun (Maybe Int) Source #

The amount of time (in seconds) that the job run consumed resources.

jobRun_glueVersion :: Lens' JobRun (Maybe Text) Source #

Glue version determines the versions of Apache Spark and Python that Glue supports. The Python version indicates the version supported for jobs of type Spark.

For more information about the available Glue versions and corresponding Spark and Python versions, see Glue version in the developer guide.

Jobs that are created without specifying a Glue version default to Glue 0.9.

jobRun_id :: Lens' JobRun (Maybe Text) Source #

The ID of this job run.

jobRun_jobName :: Lens' JobRun (Maybe Text) Source #

The name of the job definition being used in this run.

jobRun_jobRunState :: Lens' JobRun (Maybe JobRunState) Source #

The current state of the job run. For more information about the statuses of jobs that have terminated abnormally, see Glue Job Run Statuses.

jobRun_lastModifiedOn :: Lens' JobRun (Maybe UTCTime) Source #

The last time that this job run was modified.

jobRun_logGroupName :: Lens' JobRun (Maybe Text) Source #

The name of the log group for secure logging that can be server-side encrypted in Amazon CloudWatch using KMS. This name can be /aws-glue/jobs/, in which case the default encryption is NONE. If you add a role name and SecurityConfiguration name (in other words, /aws-glue/jobs-yourRoleName-yourSecurityConfigurationName/), then that security configuration is used to encrypt the log group.
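
The naming pattern described above can be written as a small function. This is a sketch using base only: logGroupName is a hypothetical helper that simply follows the two forms given in the text, and the role and security configuration names are whatever the caller supplies.

```haskell
-- | Build the log group name from an optional pair of
-- (role name, security configuration name).
logGroupName :: Maybe (String, String) -> String
logGroupName Nothing = "/aws-glue/jobs/"
logGroupName (Just (role, secConf)) =
  "/aws-glue/jobs-" ++ role ++ "-" ++ secConf ++ "/"
```

Passing Nothing yields the default group, for which the text says the default encryption is NONE.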

jobRun_maxCapacity :: Lens' JobRun (Maybe Double) Source #

The number of Glue data processing units (DPUs) that can be allocated when this job runs. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page.

Do not set Max Capacity if using WorkerType and NumberOfWorkers.

The value that can be allocated for MaxCapacity depends on whether you are running a Python shell job or an Apache Spark ETL job:

  • When you specify a Python shell job (JobCommand.Name="pythonshell"), you can allocate either 0.0625 or 1 DPU. The default is 0.0625 DPU.
  • When you specify an Apache Spark ETL job (JobCommand.Name="glueetl"), you can allocate a minimum of 2 DPUs. The default is 10 DPUs. This job type cannot have a fractional DPU allocation.

jobRun_notificationProperty :: Lens' JobRun (Maybe NotificationProperty) Source #

Specifies configuration properties of a job run notification.

jobRun_numberOfWorkers :: Lens' JobRun (Maybe Int) Source #

The number of workers of a defined workerType that are allocated when a job runs.

jobRun_predecessorRuns :: Lens' JobRun (Maybe [Predecessor]) Source #

A list of predecessors to this job run.

jobRun_previousRunId :: Lens' JobRun (Maybe Text) Source #

The ID of the previous run of this job. For example, the JobRunId specified in the StartJobRun action.

jobRun_securityConfiguration :: Lens' JobRun (Maybe Text) Source #

The name of the SecurityConfiguration structure to be used with this job run.

jobRun_startedOn :: Lens' JobRun (Maybe UTCTime) Source #

The date and time at which this job run was started.

jobRun_timeout :: Lens' JobRun (Maybe Natural) Source #

The JobRun timeout in minutes. This is the maximum time that a job run can consume resources before it is terminated and enters TIMEOUT status. This value overrides the timeout value set in the parent job.

Streaming jobs do not have a timeout. The default for non-streaming jobs is 2,880 minutes (48 hours).

jobRun_triggerName :: Lens' JobRun (Maybe Text) Source #

The name of the trigger that started this job run.

jobRun_workerType :: Lens' JobRun (Maybe WorkerType) Source #

The type of predefined worker that is allocated when a job runs. Accepts a value of Standard, G.1X, G.2X, or G.025X.

  • For the Standard worker type, each worker provides 4 vCPU, 16 GB of memory and a 50 GB disk, and 2 executors per worker.
  • For the G.1X worker type, each worker provides 4 vCPU, 16 GB of memory and a 64 GB disk, and 1 executor per worker.
  • For the G.2X worker type, each worker provides 8 vCPU, 32 GB of memory and a 128 GB disk, and 1 executor per worker.
  • For the G.025X worker type, each worker maps to 0.25 DPU (2 vCPU, 4 GB of memory, 64 GB disk), and provides 1 executor per worker. We recommend this worker type for low volume streaming jobs. This worker type is only available for Glue version 3.0 streaming jobs.

JobUpdate

jobUpdate_allocatedCapacity :: Lens' JobUpdate (Maybe Int) Source #

This field is deprecated. Use MaxCapacity instead.

The number of Glue data processing units (DPUs) to allocate to this job. You can allocate a minimum of 2 DPUs; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page.

jobUpdate_codeGenConfigurationNodes :: Lens' JobUpdate (Maybe (HashMap Text CodeGenConfigurationNode)) Source #

The representation of a directed acyclic graph on which both the Glue Studio visual component and Glue Studio code generation are based.

jobUpdate_command :: Lens' JobUpdate (Maybe JobCommand) Source #

The JobCommand that runs this job (required).

jobUpdate_connections :: Lens' JobUpdate (Maybe ConnectionsList) Source #

The connections used for this job.

jobUpdate_defaultArguments :: Lens' JobUpdate (Maybe (HashMap Text Text)) Source #

The default arguments for this job.

You can specify arguments here that your own job-execution script consumes, as well as arguments that Glue itself consumes.

For information about how to specify and consume your own Job arguments, see the Calling Glue APIs in Python topic in the developer guide.

For information about the key-value pairs that Glue consumes to set up your job, see the Special Parameters Used by Glue topic in the developer guide.

jobUpdate_description :: Lens' JobUpdate (Maybe Text) Source #

Description of the job being defined.

jobUpdate_executionClass :: Lens' JobUpdate (Maybe ExecutionClass) Source #

Indicates whether the job is run with a standard or flexible execution class. The standard execution class is ideal for time-sensitive workloads that require fast job startup and dedicated resources.

The flexible execution class is appropriate for time-insensitive jobs whose start and completion times may vary.

Only jobs with Glue version 3.0 and above and command type glueetl will be allowed to set ExecutionClass to FLEX. The flexible execution class is available for Spark jobs.
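
The FLEX constraint above can be checked before a request is built. The following is a minimal standalone sketch using only base; ExecClass, majorVersion, flexAllowed, and checkExecClass are hypothetical names for illustration and are not part of amazonka-glue. It assumes the Glue version is a string such as "3.0" whose leading digits give the major version.

```haskell
import Data.Char (isDigit)

-- Hypothetical stand-in for the two execution classes; not the library type.
data ExecClass = StandardClass | FlexClass
  deriving (Eq, Show)

-- Read the leading digits of a version string such as "3.0" (assumed format).
majorVersion :: String -> Maybe Int
majorVersion v = case span isDigit v of
  ("", _) -> Nothing
  (ds, _) -> Just (read ds)

-- FLEX is allowed only for Glue version 3.0 and above with command type "glueetl".
flexAllowed :: String -> String -> Bool
flexAllowed glueVersion commandName =
  commandName == "glueetl" && maybe False (>= 3) (majorVersion glueVersion)

-- Accept the standard class unconditionally; accept FLEX only when allowed.
checkExecClass :: ExecClass -> String -> String -> Either String ExecClass
checkExecClass StandardClass _ _ = Right StandardClass
checkExecClass FlexClass v c
  | flexAllowed v c = Right FlexClass
  | otherwise = Left "FLEX requires Glue version 3.0 or above and command type glueetl"
```

For example, checkExecClass FlexClass "2.0" "glueetl" yields a Left value, while the same call with "3.0" yields Right FlexClass.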

jobUpdate_executionProperty :: Lens' JobUpdate (Maybe ExecutionProperty) Source #

An ExecutionProperty specifying the maximum number of concurrent runs allowed for this job.

jobUpdate_glueVersion :: Lens' JobUpdate (Maybe Text) Source #

Glue version determines the versions of Apache Spark and Python that Glue supports. The Python version indicates the version supported for jobs of type Spark.

For more information about the available Glue versions and corresponding Spark and Python versions, see Glue version in the developer guide.

jobUpdate_logUri :: Lens' JobUpdate (Maybe Text) Source #

This field is reserved for future use.

jobUpdate_maxCapacity :: Lens' JobUpdate (Maybe Double) Source #

For Glue version 1.0 or earlier jobs, using the standard worker type, the number of Glue data processing units (DPUs) that can be allocated when this job runs. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page.

Do not set MaxCapacity if using WorkerType and NumberOfWorkers.

The value that can be allocated for MaxCapacity depends on whether you are running a Python shell job or an Apache Spark ETL job:

  • When you specify a Python shell job (JobCommand.Name="pythonshell"), you can allocate either 0.0625 or 1 DPU. The default is 0.0625 DPU.
  • When you specify an Apache Spark ETL job (JobCommand.Name="glueetl") or Apache Spark streaming ETL job (JobCommand.Name="gluestreaming"), you can allocate a minimum of 2 DPUs. The default is 10 DPUs. This job type cannot have a fractional DPU allocation.

For Glue version 2.0 jobs, you cannot specify a Maximum capacity. Instead, you should specify a Worker type and the Number of workers.
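
The MaxCapacity rules for the three job command names can be written as a small lookup and predicate. This is a standalone sketch using only base; JobKind, jobKindFromName, defaultMaxCapacity, and validMaxCapacity are hypothetical names for illustration, not amazonka-glue definitions, and the check covers only the rules stated above.

```haskell
-- Hypothetical model of the job command names mentioned above.
data JobKind = PythonShell | GlueEtl | GlueStreaming
  deriving (Eq, Show)

-- Map a JobCommand.Name string to the modeled job kind.
jobKindFromName :: String -> Maybe JobKind
jobKindFromName "pythonshell"   = Just PythonShell
jobKindFromName "glueetl"       = Just GlueEtl
jobKindFromName "gluestreaming" = Just GlueStreaming
jobKindFromName _               = Nothing

-- Default DPU allocation per job kind, as stated in the description.
defaultMaxCapacity :: JobKind -> Double
defaultMaxCapacity PythonShell = 0.0625
defaultMaxCapacity _           = 10

-- Python shell jobs take 0.0625 or 1 DPU; Spark jobs take a whole number >= 2.
validMaxCapacity :: JobKind -> Double -> Bool
validMaxCapacity PythonShell dpu = dpu == 0.0625 || dpu == 1
validMaxCapacity _ dpu =
  dpu >= 2 && dpu == fromIntegral (round dpu :: Integer)
```

A caller could combine the two, for instance fmap (`validMaxCapacity` 2.5) (jobKindFromName "glueetl"), which rejects the fractional allocation.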

jobUpdate_maxRetries :: Lens' JobUpdate (Maybe Int) Source #

The maximum number of times to retry this job if it fails.

jobUpdate_nonOverridableArguments :: Lens' JobUpdate (Maybe (HashMap Text Text)) Source #

Non-overridable arguments for this job, specified as name-value pairs.

jobUpdate_notificationProperty :: Lens' JobUpdate (Maybe NotificationProperty) Source #

Specifies the configuration properties of a job notification.

jobUpdate_numberOfWorkers :: Lens' JobUpdate (Maybe Int) Source #

The number of workers of a defined workerType that are allocated when a job runs.

jobUpdate_role :: Lens' JobUpdate (Maybe Text) Source #

The name or Amazon Resource Name (ARN) of the IAM role associated with this job (required).

jobUpdate_securityConfiguration :: Lens' JobUpdate (Maybe Text) Source #

The name of the SecurityConfiguration structure to be used with this job.

jobUpdate_sourceControlDetails :: Lens' JobUpdate (Maybe SourceControlDetails) Source #

The details for a source control configuration for a job, allowing synchronization of job artifacts to or from a remote repository.

jobUpdate_timeout :: Lens' JobUpdate (Maybe Natural) Source #

The job timeout in minutes. This is the maximum time that a job run can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).

jobUpdate_workerType :: Lens' JobUpdate (Maybe WorkerType) Source #

The type of predefined worker that is allocated when a job runs. Accepts a value of Standard, G.1X, G.2X, or G.025X.

  • For the Standard worker type, each worker provides 4 vCPU, 16 GB of memory and a 50 GB disk, and 2 executors per worker.
  • For the G.1X worker type, each worker maps to 1 DPU (4 vCPU, 16 GB of memory, 64 GB disk), and provides 1 executor per worker. We recommend this worker type for memory-intensive jobs.
  • For the G.2X worker type, each worker maps to 2 DPU (8 vCPU, 32 GB of memory, 128 GB disk), and provides 1 executor per worker. We recommend this worker type for memory-intensive jobs.
  • For the G.025X worker type, each worker maps to 0.25 DPU (2 vCPU, 4 GB of memory, 64 GB disk), and provides 1 executor per worker. We recommend this worker type for low volume streaming jobs. This worker type is only available for Glue version 3.0 streaming jobs.
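
The per-worker figures listed above can be tabulated to estimate the total resources for a given number of workers. This is a standalone sketch using only base; Worker, Resources, perWorker, totalFor, dpuPerWorker, and totalDpus are hypothetical names for illustration and are not the amazonka-glue WorkerType. The DPU mapping is left undefined for the Standard type because the description does not state one.

```haskell
-- Hypothetical stand-in for the four worker types described above.
data Worker = StandardWorker | G1X | G2X | G025X
  deriving (Eq, Show)

data Resources = Resources
  { vCpus     :: Int
  , memoryGb  :: Int
  , diskGb    :: Int
  , executors :: Int
  }
  deriving (Eq, Show)

-- Resources provided by a single worker of each type.
perWorker :: Worker -> Resources
perWorker StandardWorker = Resources 4 16 50 2
perWorker G1X            = Resources 4 16 64 1
perWorker G2X            = Resources 8 32 128 1
perWorker G025X          = Resources 2 4 64 1

-- Resources provided by n workers of the same type.
totalFor :: Worker -> Int -> Resources
totalFor w n =
  Resources (n * vCpus r) (n * memoryGb r) (n * diskGb r) (n * executors r)
  where
    r = perWorker w

-- DPUs per worker, where the description states a mapping.
dpuPerWorker :: Worker -> Maybe Double
dpuPerWorker G1X            = Just 1
dpuPerWorker G2X            = Just 2
dpuPerWorker G025X          = Just 0.25
dpuPerWorker StandardWorker = Nothing

-- Total DPUs for n workers, when a per-worker mapping is known.
totalDpus :: Worker -> Int -> Maybe Double
totalDpus w n = fmap (* fromIntegral n) (dpuPerWorker w)
```

For instance, totalFor G2X 10 describes ten G.2X workers, and totalDpus G025X 8 gives the DPU total for eight G.025X workers.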

Join

join_name :: Lens' Join Text Source #

The name of the transform node.

join_inputs :: Lens' Join (NonEmpty Text) Source #

The data inputs identified by their node names.

join_joinType :: Lens' Join JoinType Source #

Specifies the type of join to be performed on the datasets.

join_columns :: Lens' Join (NonEmpty JoinColumn) Source #

A list of the two columns to be joined.

JoinColumn

joinColumn_from :: Lens' JoinColumn Text Source #

The column to be joined.

joinColumn_keys :: Lens' JoinColumn [[Text]] Source #

The key of the column to be joined.

JsonClassifier

jsonClassifier_creationTime :: Lens' JsonClassifier (Maybe UTCTime) Source #

The time that this classifier was registered.

jsonClassifier_lastUpdated :: Lens' JsonClassifier (Maybe UTCTime) Source #

The time that this classifier was last updated.

jsonClassifier_version :: Lens' JsonClassifier (Maybe Integer) Source #

The version of this classifier.

jsonClassifier_name :: Lens' JsonClassifier Text Source #

The name of the classifier.

jsonClassifier_jsonPath :: Lens' JsonClassifier Text Source #

A JsonPath string defining the JSON data for the classifier to classify. Glue supports a subset of JsonPath, as described in Writing JsonPath Custom Classifiers.

KafkaStreamingSourceOptions

kafkaStreamingSourceOptions_assign :: Lens' KafkaStreamingSourceOptions (Maybe Text) Source #

The specific TopicPartitions to consume. You must specify at least one of "topicName", "assign" or "subscribePattern".

kafkaStreamingSourceOptions_bootstrapServers :: Lens' KafkaStreamingSourceOptions (Maybe Text) Source #

A list of bootstrap server URLs, for example, as b-1.vpc-test-2.o4q88o.c6.kafka.us-east-1.amazonaws.com:9094. This option must be specified in the API call or defined in the table metadata in the Data Catalog.

kafkaStreamingSourceOptions_endingOffsets :: Lens' KafkaStreamingSourceOptions (Maybe Text) Source #

The end point when a batch query is ended. Possible values are either "latest" or a JSON string that specifies an ending offset for each TopicPartition.

kafkaStreamingSourceOptions_maxOffsetsPerTrigger :: Lens' KafkaStreamingSourceOptions (Maybe Natural) Source #

The rate limit on the maximum number of offsets that are processed per trigger interval. The specified total number of offsets is proportionally split across topicPartitions of different volumes. The default value is null, which means that the consumer reads all offsets until the known latest offset.

kafkaStreamingSourceOptions_minPartitions :: Lens' KafkaStreamingSourceOptions (Maybe Natural) Source #

The desired minimum number of partitions to read from Kafka. The default value is null, which means that the number of Spark partitions is equal to the number of Kafka partitions.

kafkaStreamingSourceOptions_numRetries :: Lens' KafkaStreamingSourceOptions (Maybe Natural) Source #

The number of times to retry before failing to fetch Kafka offsets. The default value is 3.

kafkaStreamingSourceOptions_pollTimeoutMs :: Lens' KafkaStreamingSourceOptions (Maybe Natural) Source #

The timeout in milliseconds to poll data from Kafka in Spark job executors. The default value is 512.

kafkaStreamingSourceOptions_retryIntervalMs :: Lens' KafkaStreamingSourceOptions (Maybe Natural) Source #

The time in milliseconds to wait before retrying to fetch Kafka offsets. The default value is 10.

kafkaStreamingSourceOptions_securityProtocol :: Lens' KafkaStreamingSourceOptions (Maybe Text) Source #

The protocol used to communicate with brokers. The possible values are "SSL" or "PLAINTEXT".

kafkaStreamingSourceOptions_startingOffsets :: Lens' KafkaStreamingSourceOptions (Maybe Text) Source #

The starting position in the Kafka topic to read data from. The possible values are "earliest" or "latest". The default value is "latest".

kafkaStreamingSourceOptions_subscribePattern :: Lens' KafkaStreamingSourceOptions (Maybe Text) Source #

A Java regex string that identifies the topic list to subscribe to. You must specify at least one of "topicName", "assign" or "subscribePattern".

kafkaStreamingSourceOptions_topicName :: Lens' KafkaStreamingSourceOptions (Maybe Text) Source #

The topic name as specified in Apache Kafka. You must specify at least one of "topicName", "assign" or "subscribePattern".
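
The requirement that at least one of "topicName", "assign" or "subscribePattern" is present can be checked with a short predicate. This is a standalone sketch using only base; KafkaSelectors and hasTopicSelector are hypothetical names for illustration and do not come from amazonka-glue.

```haskell
import Data.Maybe (isJust)

-- Hypothetical record holding only the three topic-selection options.
data KafkaSelectors = KafkaSelectors
  { topicName        :: Maybe String
  , assign           :: Maybe String
  , subscribePattern :: Maybe String
  }

-- True when at least one of the three options is specified.
hasTopicSelector :: KafkaSelectors -> Bool
hasTopicSelector k =
  any isJust [topicName k, assign k, subscribePattern k]
```

A value with all three fields set to Nothing fails the check; setting any one of them is enough to pass.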

KeySchemaElement

keySchemaElement_name :: Lens' KeySchemaElement Text Source #

The name of a partition key.

keySchemaElement_type :: Lens' KeySchemaElement Text Source #

The type of a partition key.

KinesisStreamingSourceOptions

kinesisStreamingSourceOptions_addIdleTimeBetweenReads :: Lens' KinesisStreamingSourceOptions (Maybe Bool) Source #

Adds a time delay between two consecutive getRecords operations. The default value is "False". This option is only configurable for Glue version 2.0 and above.

kinesisStreamingSourceOptions_avoidEmptyBatches :: Lens' KinesisStreamingSourceOptions (Maybe Bool) Source #

Avoids creating an empty microbatch job by checking for unread data in the Kinesis data stream before the batch is started. The default value is "False".

kinesisStreamingSourceOptions_describeShardInterval :: Lens' KinesisStreamingSourceOptions (Maybe Natural) Source #

The minimum time interval between two ListShards API calls for your script to consider resharding. The default value is 1s.

kinesisStreamingSourceOptions_idleTimeBetweenReadsInMs :: Lens' KinesisStreamingSourceOptions (Maybe Natural) Source #

The minimum time delay between two consecutive getRecords operations, specified in ms. The default value is 1000. This option is only configurable for Glue version 2.0 and above.

kinesisStreamingSourceOptions_maxFetchRecordsPerShard :: Lens' KinesisStreamingSourceOptions (Maybe Natural) Source #

The maximum number of records to fetch per shard in the Kinesis data stream. The default value is 100000.

kinesisStreamingSourceOptions_maxFetchTimeInMs :: Lens' KinesisStreamingSourceOptions (Maybe Natural) Source #

The maximum time spent in the job executor to fetch a record from the Kinesis data stream per shard, specified in milliseconds (ms). The default value is 1000.

kinesisStreamingSourceOptions_maxRecordPerRead :: Lens' KinesisStreamingSourceOptions (Maybe Natural) Source #

The maximum number of records to fetch from the Kinesis data stream in each getRecords operation. The default value is 10000.

kinesisStreamingSourceOptions_maxRetryIntervalMs :: Lens' KinesisStreamingSourceOptions (Maybe Natural) Source #

The maximum cool-off time period (specified in ms) between two retries of a Kinesis Data Streams API call. The default value is 10000.

kinesisStreamingSourceOptions_numRetries :: Lens' KinesisStreamingSourceOptions (Maybe Natural) Source #

The maximum number of retries for Kinesis Data Streams API requests. The default value is 3.

kinesisStreamingSourceOptions_retryIntervalMs :: Lens' KinesisStreamingSourceOptions (Maybe Natural) Source #

The cool-off time period (specified in ms) before retrying the Kinesis Data Streams API call. The default value is 1000.

kinesisStreamingSourceOptions_roleArn :: Lens' KinesisStreamingSourceOptions (Maybe Text) Source #

The Amazon Resource Name (ARN) of the role to assume using AWS Security Token Service (AWS STS). This role must have permissions for describe or read record operations for the Kinesis data stream. You must use this parameter when accessing a data stream in a different account. Used in conjunction with "awsSTSSessionName".

kinesisStreamingSourceOptions_roleSessionName :: Lens' KinesisStreamingSourceOptions (Maybe Text) Source #

An identifier for the session assuming the role using AWS STS. You must use this parameter when accessing a data stream in a different account. Used in conjunction with "awsSTSRoleARN".
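
Because the role ARN and the session name are described as being used together, a configuration can be classified before use. This is a standalone sketch using only base; StsOptions, StsCheck, and classifySts are hypothetical names for illustration, and treating a lone value as incomplete is an assumption drawn from the "used in conjunction with" wording rather than documented service behavior.

```haskell
-- Hypothetical record holding only the two STS-related options.
data StsOptions = StsOptions
  { roleArn         :: Maybe String
  , roleSessionName :: Maybe String
  }

data StsCheck
  = NoRole            -- neither option set (no role is assumed)
  | CrossAccountReady -- both options set
  | Incomplete        -- only one of the two options set
  deriving (Eq, Show)

-- Classify the pair of options according to which of them are present.
classifySts :: StsOptions -> StsCheck
classifySts o = case (roleArn o, roleSessionName o) of
  (Nothing, Nothing) -> NoRole
  (Just _, Just _)   -> CrossAccountReady
  _                  -> Incomplete
```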

kinesisStreamingSourceOptions_startingPosition :: Lens' KinesisStreamingSourceOptions (Maybe StartingPosition) Source #

The starting position in the Kinesis data stream to read data from. The possible values are "latest", "trim_horizon", or "earliest". The default value is "latest".

kinesisStreamingSourceOptions_streamArn :: Lens' KinesisStreamingSourceOptions (Maybe Text) Source #

The Amazon Resource Name (ARN) of the Kinesis data stream.

LabelingSetGenerationTaskRunProperties

labelingSetGenerationTaskRunProperties_outputS3Path :: Lens' LabelingSetGenerationTaskRunProperties (Maybe Text) Source #

The Amazon Simple Storage Service (Amazon S3) path where you will generate the labeling set.

LakeFormationConfiguration

lakeFormationConfiguration_accountId :: Lens' LakeFormationConfiguration (Maybe Text) Source #

Required for cross-account crawls. For crawls in the same account as the target data, this can be left as null.

lakeFormationConfiguration_useLakeFormationCredentials :: Lens' LakeFormationConfiguration (Maybe Bool) Source #

Specifies whether to use Lake Formation credentials for the crawler instead of the IAM role credentials.

LastActiveDefinition

lastActiveDefinition_blueprintLocation :: Lens' LastActiveDefinition (Maybe Text) Source #

Specifies a path in Amazon S3 where the blueprint is published by the Glue developer.

lastActiveDefinition_blueprintServiceLocation :: Lens' LastActiveDefinition (Maybe Text) Source #

Specifies a path in Amazon S3 where the blueprint is copied when you create or update the blueprint.

lastActiveDefinition_lastModifiedOn :: Lens' LastActiveDefinition (Maybe UTCTime) Source #

The date and time the blueprint was last modified.

lastActiveDefinition_parameterSpec :: Lens' LastActiveDefinition (Maybe Text) Source #

A JSON string specifying the parameters for the blueprint.

LastCrawlInfo

lastCrawlInfo_errorMessage :: Lens' LastCrawlInfo (Maybe Text) Source #

If an error occurred, the error information about the last crawl.

lastCrawlInfo_logGroup :: Lens' LastCrawlInfo (Maybe Text) Source #

The log group for the last crawl.

lastCrawlInfo_logStream :: Lens' LastCrawlInfo (Maybe Text) Source #

The log stream for the last crawl.

lastCrawlInfo_messagePrefix :: Lens' LastCrawlInfo (Maybe Text) Source #

The prefix for a message about this crawl.

lastCrawlInfo_startTime :: Lens' LastCrawlInfo (Maybe UTCTime) Source #

The time at which the crawl started.

LineageConfiguration

lineageConfiguration_crawlerLineageSettings :: Lens' LineageConfiguration (Maybe CrawlerLineageSettings) Source #

Specifies whether data lineage is enabled for the crawler. Valid values are:

  • ENABLE: enables data lineage for the crawler
  • DISABLE: disables data lineage for the crawler

Location

location_dynamoDB :: Lens' Location (Maybe [CodeGenNodeArg]) Source #

An Amazon DynamoDB table location.

location_s3 :: Lens' Location (Maybe [CodeGenNodeArg]) Source #

An Amazon Simple Storage Service (Amazon S3) location.

LongColumnStatisticsData

MLTransform

mLTransform_createdOn :: Lens' MLTransform (Maybe UTCTime) Source #

A timestamp. The time and date that this machine learning transform was created.

mLTransform_description :: Lens' MLTransform (Maybe Text) Source #

A user-defined, long-form description text for the machine learning transform. Descriptions are not guaranteed to be unique and can be changed at any time.

mLTransform_evaluationMetrics :: Lens' MLTransform (Maybe EvaluationMetrics) Source #

An EvaluationMetrics object. Evaluation metrics provide an estimate of the quality of your machine learning transform.

mLTransform_glueVersion :: Lens' MLTransform (Maybe Text) Source #

This value determines which version of Glue this machine learning transform is compatible with. Glue 1.0 is recommended for most customers. If the value is not set, the Glue compatibility defaults to Glue 0.9. For more information, see Glue Versions in the developer guide.

mLTransform_inputRecordTables :: Lens' MLTransform (Maybe [GlueTable]) Source #

A list of Glue table definitions used by the transform.

mLTransform_labelCount :: Lens' MLTransform (Maybe Int) Source #

A count identifier for the labeling files generated by Glue for this transform. As you create a better transform, you can iteratively download, label, and upload the labeling file.

mLTransform_lastModifiedOn :: Lens' MLTransform (Maybe UTCTime) Source #

A timestamp. The last point in time when this machine learning transform was modified.

mLTransform_maxCapacity :: Lens' MLTransform (Maybe Double) Source #

The number of Glue data processing units (DPUs) that are allocated to task runs for this transform. You can allocate from 2 to 100 DPUs; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page.

MaxCapacity is a mutually exclusive option with NumberOfWorkers and WorkerType.

  • If either NumberOfWorkers or WorkerType is set, then MaxCapacity cannot be set.
  • If MaxCapacity is set, then neither NumberOfWorkers nor WorkerType can be set.
  • If WorkerType is set, then NumberOfWorkers is required (and vice versa).
  • MaxCapacity and NumberOfWorkers must both be at least 1.

When the WorkerType field is set to a value other than Standard, the MaxCapacity field is set automatically and becomes read-only.
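
The four bullet rules above amount to a case analysis over which of the three fields are present. This is a standalone sketch using only base; Capacity and validateCapacity are hypothetical names for illustration, not amazonka-glue definitions. It encodes only the bullet rules and does not enforce the separate 2 to 100 DPU range mentioned in the first paragraph.

```haskell
-- Hypothetical record holding only the three capacity-related fields.
data Capacity = Capacity
  { maxCapacity     :: Maybe Double
  , numberOfWorkers :: Maybe Int
  , workerType      :: Maybe String
  }

-- Check the mutual-exclusion and pairing rules; Left carries the violated rule.
validateCapacity :: Capacity -> Either String ()
validateCapacity c = case (maxCapacity c, numberOfWorkers c, workerType c) of
  (Just _, Just _, _) -> Left "MaxCapacity cannot be combined with NumberOfWorkers"
  (Just _, _, Just _) -> Left "MaxCapacity cannot be combined with WorkerType"
  (Just m, Nothing, Nothing)
    | m < 1     -> Left "MaxCapacity must be at least 1"
    | otherwise -> Right ()
  (Nothing, Just n, Just _)
    | n < 1     -> Left "NumberOfWorkers must be at least 1"
    | otherwise -> Right ()
  (Nothing, Just _, Nothing)  -> Left "NumberOfWorkers requires WorkerType"
  (Nothing, Nothing, Just _)  -> Left "WorkerType requires NumberOfWorkers"
  (Nothing, Nothing, Nothing) -> Right ()
```

Leaving all three fields unset is accepted here on the assumption that service defaults then apply.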

mLTransform_maxRetries :: Lens' MLTransform (Maybe Int) Source #

The maximum number of times to retry after an MLTaskRun of the machine learning transform fails.

mLTransform_name :: Lens' MLTransform (Maybe Text) Source #

A user-defined name for the machine learning transform. Names are not guaranteed to be unique and can be changed at any time.

mLTransform_numberOfWorkers :: Lens' MLTransform (Maybe Int) Source #

The number of workers of a defined workerType that are allocated when a task of the transform runs.

If WorkerType is set, then NumberOfWorkers is required (and vice versa).

mLTransform_parameters :: Lens' MLTransform (Maybe TransformParameters) Source #

A TransformParameters object. You can use parameters to tune (customize) the behavior of the machine learning transform by specifying what data it learns from and your preference on various tradeoffs (such as precision vs. recall, or accuracy vs. cost).

mLTransform_role :: Lens' MLTransform (Maybe Text) Source #

The name or Amazon Resource Name (ARN) of the IAM role with the required permissions. The required permissions include both Glue service role permissions to Glue resources, and Amazon S3 permissions required by the transform.

  • This role needs Glue service role permissions to allow access to resources in Glue. See Attach a Policy to IAM Users That Access Glue.
  • This role needs permission to your Amazon Simple Storage Service (Amazon S3) sources, targets, temporary directory, scripts, and any libraries used by the task run for this transform.

mLTransform_schema :: Lens' MLTransform (Maybe [SchemaColumn]) Source #

A map of key-value pairs representing the columns and data types that this transform can run against. Has an upper bound of 100 columns.

mLTransform_status :: Lens' MLTransform (Maybe TransformStatusType) Source #

The current status of the machine learning transform.

mLTransform_timeout :: Lens' MLTransform (Maybe Natural) Source #

The timeout in minutes of the machine learning transform.

mLTransform_transformEncryption :: Lens' MLTransform (Maybe TransformEncryption) Source #

The encryption-at-rest settings of the transform that apply to accessing user data. Machine learning transforms can access user data encrypted in Amazon S3 using KMS.

mLTransform_transformId :: Lens' MLTransform (Maybe Text) Source #

The unique transform ID that is generated for the machine learning transform. The ID is guaranteed to be unique and does not change.

mLTransform_workerType :: Lens' MLTransform (Maybe WorkerType) Source #

The type of predefined worker that is allocated when a task of this transform runs. Accepts a value of Standard, G.1X, or G.2X.

  • For the Standard worker type, each worker provides 4 vCPU, 16 GB of memory and a 50 GB disk, and 2 executors per worker.
  • For the G.1X worker type, each worker provides 4 vCPU, 16 GB of memory and a 64 GB disk, and 1 executor per worker.
  • For the G.2X worker type, each worker provides 8 vCPU, 32 GB of memory and a 128 GB disk, and 1 executor per worker.

MaxCapacity is a mutually exclusive option with NumberOfWorkers and WorkerType.

  • If either NumberOfWorkers or WorkerType is set, then MaxCapacity cannot be set.
  • If MaxCapacity is set, then neither NumberOfWorkers nor WorkerType can be set.
  • If WorkerType is set, then NumberOfWorkers is required (and vice versa).
  • MaxCapacity and NumberOfWorkers must both be at least 1.

MLUserDataEncryption

mLUserDataEncryption_kmsKeyId :: Lens' MLUserDataEncryption (Maybe Text) Source #

The ID for the customer-provided KMS key.

mLUserDataEncryption_mlUserDataEncryptionMode :: Lens' MLUserDataEncryption MLUserDataEncryptionModeString Source #

The encryption mode applied to user data. Valid values are:

  • DISABLED: encryption is disabled
  • SSEKMS: use of server-side encryption with Key Management Service (SSE-KMS) for user data stored in Amazon S3.

Mapping

mapping_children :: Lens' Mapping (Maybe [Mapping]) Source #

Only applicable to nested data structures. If you want to change the parent structure, but also one of its children, you can fill out this data structure. It is also a Mapping, but its FromPath will be the parent's FromPath plus the FromPath from this structure.

For the children part, suppose you have the structure:

{ "FromPath": "OuterStructure", "ToKey": "OuterStructure", "ToType": "Struct", "Dropped": false, "Children": [{ "FromPath": "inner", "ToKey": "inner", "ToType": "Double", "Dropped": false }] }

You can specify a Mapping that looks like:

{ "FromPath": "OuterStructure", "ToKey": "OuterStructure", "ToType": "Struct", "Dropped": false, "Children": [{ "FromPath": "inner", "ToKey": "inner", "ToType": "Double", "Dropped": false }] }
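
The way a child's effective path is formed from its parent's FromPath can be illustrated with a small recursive record. This is a standalone sketch using only base; MappingNode and resolvedPaths are hypothetical names for illustration and are not the amazonka-glue Mapping type. It assumes paths are lists of segments, matching the list-valued FromPath field.

```haskell
-- Hypothetical simplified model of a nested mapping entry.
data MappingNode = MappingNode
  { fromPath :: [String]
  , toKey    :: String
  , toType   :: String
  , dropped  :: Bool
  , children :: [MappingNode]
  }

-- Effective source path of a node followed by those of all its descendants.
-- Each child's path is the parent's effective path plus the child's own fromPath.
resolvedPaths :: MappingNode -> [[String]]
resolvedPaths = go []
  where
    go parent m =
      let full = parent ++ fromPath m
       in full : concatMap (go full) (children m)

-- The OuterStructure example from the description.
example :: MappingNode
example =
  MappingNode ["OuterStructure"] "OuterStructure" "Struct" False
    [ MappingNode ["inner"] "inner" "Double" False [] ]
```

Applied to the example, resolvedPaths lists the outer path first and then the child's path prefixed by it.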

mapping_dropped :: Lens' Mapping (Maybe Bool) Source #

If true, then the column is removed.

mapping_fromPath :: Lens' Mapping (Maybe [Text]) Source #

The table or column to be modified.

mapping_fromType :: Lens' Mapping (Maybe Text) Source #

The type of the data to be modified.

mapping_toKey :: Lens' Mapping (Maybe Text) Source #

The name that the column should have after the mapping is applied. Can be the same as FromPath.

mapping_toType :: Lens' Mapping (Maybe Text) Source #

The data type that the data is to be modified to.

MappingEntry

mappingEntry_sourceTable :: Lens' MappingEntry (Maybe Text) Source #

The name of the source table.

Merge

merge_name :: Lens' Merge Text Source #

The name of the transform node.

merge_inputs :: Lens' Merge (NonEmpty Text) Source #

The data inputs identified by their node names.

merge_source :: Lens' Merge Text Source #

The source DynamicFrame that will be merged with a staging DynamicFrame.

merge_primaryKeys :: Lens' Merge [[Text]] Source #

The list of primary key fields to match records from the source and staging dynamic frames.

MetadataInfo

metadataInfo_createdTime :: Lens' MetadataInfo (Maybe Text) Source #

The time at which the entry was created.

metadataInfo_metadataValue :: Lens' MetadataInfo (Maybe Text) Source #

The metadata key’s corresponding value.

metadataInfo_otherMetadataValueList :: Lens' MetadataInfo (Maybe [OtherMetadataValueListItem]) Source #

Other metadata belonging to the same metadata key.

MetadataKeyValuePair

metadataKeyValuePair_metadataValue :: Lens' MetadataKeyValuePair (Maybe Text) Source #

A metadata key’s corresponding value.

MicrosoftSQLServerCatalogSource

microsoftSQLServerCatalogSource_table :: Lens' MicrosoftSQLServerCatalogSource Text Source #

The name of the table in the database to read from.

MicrosoftSQLServerCatalogTarget

microsoftSQLServerCatalogTarget_table :: Lens' MicrosoftSQLServerCatalogTarget Text Source #

The name of the table in the database to write to.

MongoDBTarget

mongoDBTarget_connectionName :: Lens' MongoDBTarget (Maybe Text) Source #

The name of the connection to use to connect to the Amazon DocumentDB or MongoDB target.

mongoDBTarget_path :: Lens' MongoDBTarget (Maybe Text) Source #

The path of the Amazon DocumentDB or MongoDB target (database/collection).

mongoDBTarget_scanAll :: Lens' MongoDBTarget (Maybe Bool) Source #

Indicates whether to scan all the records, or to sample rows from the table. Scanning all the records can take a long time when the table is not a high throughput table.

A value of true means to scan all records, while a value of false means to sample the records. If no value is specified, the value defaults to true.

MySQLCatalogSource

mySQLCatalogSource_database :: Lens' MySQLCatalogSource Text Source #

The name of the database to read from.

mySQLCatalogSource_table :: Lens' MySQLCatalogSource Text Source #

The name of the table in the database to read from.

MySQLCatalogTarget

mySQLCatalogTarget_inputs :: Lens' MySQLCatalogTarget (NonEmpty Text) Source #

The nodes that are inputs to the data target.

mySQLCatalogTarget_database :: Lens' MySQLCatalogTarget Text Source #

The name of the database to write to.

mySQLCatalogTarget_table :: Lens' MySQLCatalogTarget Text Source #

The name of the table in the database to write to.

Node

node_crawlerDetails :: Lens' Node (Maybe CrawlerNodeDetails) Source #

Details of the crawler when the node represents a crawler.

node_jobDetails :: Lens' Node (Maybe JobNodeDetails) Source #

Details of the Job when the node represents a Job.

node_name :: Lens' Node (Maybe Text) Source #

The name of the Glue component represented by the node.

node_triggerDetails :: Lens' Node (Maybe TriggerNodeDetails) Source #

Details of the Trigger when the node represents a Trigger.

node_type :: Lens' Node (Maybe NodeType) Source #

The type of Glue component represented by the node.

node_uniqueId :: Lens' Node (Maybe Text) Source #

The unique ID assigned to the node within the workflow.

NotificationProperty

notificationProperty_notifyDelayAfter :: Lens' NotificationProperty (Maybe Natural) Source #

After a job run starts, the number of minutes to wait before sending a job run delay notification.

NullCheckBoxList

nullCheckBoxList_isEmpty :: Lens' NullCheckBoxList (Maybe Bool) Source #

Specifies that an empty string is considered as a null value.

nullCheckBoxList_isNegOne :: Lens' NullCheckBoxList (Maybe Bool) Source #

Specifies that an integer value of -1 is considered as a null value.

nullCheckBoxList_isNullString :: Lens' NullCheckBoxList (Maybe Bool) Source #

Specifies that a value spelling out the word 'null' is considered as a null value.
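
The three flags can be read as independent tests applied to a value. This is a standalone sketch using only base; NullChecks and treatedAsNull are hypothetical names for illustration, not amazonka-glue definitions. It assumes values are compared as strings and that the word 'null' is matched exactly in lowercase, neither of which is stated in the descriptions above.

```haskell
-- Hypothetical record mirroring the three flags, with plain Bool fields.
data NullChecks = NullChecks
  { isEmpty      :: Bool
  , isNegOne     :: Bool
  , isNullString :: Bool
  }

-- True when any enabled flag classifies the value as null.
treatedAsNull :: NullChecks -> String -> Bool
treatedAsNull c v =
  (isEmpty c && null v)
    || (isNegOne c && v == "-1")
    || (isNullString c && v == "null")
```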

NullValueField

nullValueField_value :: Lens' NullValueField Text Source #

The value of the null placeholder.

OracleSQLCatalogSource

oracleSQLCatalogSource_database :: Lens' OracleSQLCatalogSource Text Source #

The name of the database to read from.

oracleSQLCatalogSource_table :: Lens' OracleSQLCatalogSource Text Source #

The name of the table in the database to read from.

OracleSQLCatalogTarget

oracleSQLCatalogTarget_inputs :: Lens' OracleSQLCatalogTarget (NonEmpty Text) Source #

The nodes that are inputs to the data target.

oracleSQLCatalogTarget_table :: Lens' OracleSQLCatalogTarget Text Source #

The name of the table in the database to write to.

Order

order_column :: Lens' Order Text Source #

The name of the column.

order_sortOrder :: Lens' Order Natural Source #

Indicates that the column is sorted in ascending order (== 1) or in descending order (== 0).
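
Since the sort order is carried as a bare number, a small conversion makes the two meanings explicit. This is a standalone sketch using only base; Direction, directionFromSortOrder, and sortOrderFromDirection are hypothetical names for illustration and are not part of amazonka-glue. Values other than 0 and 1 are treated as unrecognized.

```haskell
import Numeric.Natural (Natural)

-- Hypothetical named form of the numeric sort order.
data Direction = Descending | Ascending
  deriving (Eq, Show)

-- Decode the numeric sort order: 1 is ascending, 0 is descending.
directionFromSortOrder :: Natural -> Maybe Direction
directionFromSortOrder 1 = Just Ascending
directionFromSortOrder 0 = Just Descending
directionFromSortOrder _ = Nothing

-- Encode a direction back into the numeric form.
sortOrderFromDirection :: Direction -> Natural
sortOrderFromDirection Ascending  = 1
sortOrderFromDirection Descending = 0
```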

OtherMetadataValueListItem

otherMetadataValueListItem_metadataValue :: Lens' OtherMetadataValueListItem (Maybe Text) Source #

The metadata key’s corresponding value for the other metadata belonging to the same metadata key.

PIIDetection

pIIDetection_maskValue :: Lens' PIIDetection (Maybe Text) Source #

Indicates the value that will replace the detected entity.

pIIDetection_outputColumnName :: Lens' PIIDetection (Maybe Text) Source #

Indicates the output column name that will contain any entity type detected in that row.

pIIDetection_sampleFraction :: Lens' PIIDetection (Maybe Double) Source #

Indicates the fraction of the data to sample when scanning for PII entities.

pIIDetection_thresholdFraction :: Lens' PIIDetection (Maybe Double) Source #

Indicates the fraction of the data that must be met in order for a column to be identified as PII data.

pIIDetection_name :: Lens' PIIDetection Text Source #

The name of the transform node.

pIIDetection_inputs :: Lens' PIIDetection (NonEmpty Text) Source #

The node ID inputs to the transform.

pIIDetection_piiType :: Lens' PIIDetection PiiType Source #

Indicates the type of PIIDetection transform.

pIIDetection_entityTypesToDetect :: Lens' PIIDetection [Text] Source #

Indicates the types of entities the PIIDetection transform will identify as PII data.

PII type entities include: PERSON_NAME, DATE, USA_SNN, EMAIL, USA_ITIN, USA_PASSPORT_NUMBER, PHONE_NUMBER, BANK_ACCOUNT, IP_ADDRESS, MAC_ADDRESS, USA_CPT_CODE, USA_HCPCS_CODE, USA_NATIONAL_DRUG_CODE, USA_MEDICARE_BENEFICIARY_IDENTIFIER, USA_HEALTH_INSURANCE_CLAIM_NUMBER, CREDIT_CARD, USA_NATIONAL_PROVIDER_IDENTIFIER, USA_DEA_NUMBER, USA_DRIVING_LICENSE

Partition

partition_catalogId :: Lens' Partition (Maybe Text) Source #

The ID of the Data Catalog in which the partition resides.

partition_creationTime :: Lens' Partition (Maybe UTCTime) Source #

The time at which the partition was created.

partition_databaseName :: Lens' Partition (Maybe Text) Source #

The name of the catalog database in which to create the partition.

partition_lastAccessTime :: Lens' Partition (Maybe UTCTime) Source #

The last time at which the partition was accessed.

partition_lastAnalyzedTime :: Lens' Partition (Maybe UTCTime) Source #

The last time at which column statistics were computed for this partition.

partition_parameters :: Lens' Partition (Maybe (HashMap Text Text)) Source #

These key-value pairs define partition parameters.

partition_storageDescriptor :: Lens' Partition (Maybe StorageDescriptor) Source #

Provides information about the physical location where the partition is stored.

partition_tableName :: Lens' Partition (Maybe Text) Source #

The name of the database table in which to create the partition.

partition_values :: Lens' Partition (Maybe [Text]) Source #

The values of the partition.

PartitionError

partitionError_errorDetail :: Lens' PartitionError (Maybe ErrorDetail) Source #

The details about the partition error.

partitionError_partitionValues :: Lens' PartitionError (Maybe [Text]) Source #

The values that define the partition.

PartitionIndex

partitionIndex_keys :: Lens' PartitionIndex (NonEmpty Text) Source #

The keys for the partition index.

partitionIndex_indexName :: Lens' PartitionIndex Text Source #

The name of the partition index.

PartitionIndexDescriptor

partitionIndexDescriptor_backfillErrors :: Lens' PartitionIndexDescriptor (Maybe [BackfillError]) Source #

A list of errors that can occur when registering partition indexes for an existing table.

partitionIndexDescriptor_keys :: Lens' PartitionIndexDescriptor (NonEmpty KeySchemaElement) Source #

A list of one or more keys, as KeySchemaElement structures, for the partition index.

partitionIndexDescriptor_indexStatus :: Lens' PartitionIndexDescriptor PartitionIndexStatus Source #

The status of the partition index.

The possible statuses are:

  • CREATING: The index is being created. When an index is in a CREATING state, the index or its table cannot be deleted.
  • ACTIVE: The index creation succeeds.
  • FAILED: The index creation fails.
  • DELETING: The index is deleted from the list of indexes.

PartitionInput

partitionInput_lastAccessTime :: Lens' PartitionInput (Maybe UTCTime) Source #

The last time at which the partition was accessed.

partitionInput_lastAnalyzedTime :: Lens' PartitionInput (Maybe UTCTime) Source #

The last time at which column statistics were computed for this partition.

partitionInput_parameters :: Lens' PartitionInput (Maybe (HashMap Text Text)) Source #

These key-value pairs define partition parameters.

partitionInput_storageDescriptor :: Lens' PartitionInput (Maybe StorageDescriptor) Source #

Provides information about the physical location where the partition is stored.

partitionInput_values :: Lens' PartitionInput (Maybe [Text]) Source #

The values of the partition. Although this parameter is not required by the SDK, you must specify this parameter for a valid input.

The values for the keys for the new partition must be passed as an array of String objects that must be ordered in the same order as the partition keys appearing in the Amazon S3 prefix. Otherwise Glue will add the values to the wrong keys.
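
The ordering requirement can be checked before a request is built. The following is a minimal sketch, not part of the SDK: orderPartitionValues is a hypothetical helper, and plain String stands in for Text. It arranges named values in the order of the table's partition keys and fails if any key has no value.

```haskell
-- Hypothetical helper, not part of amazonka-glue.
-- Given the table's partition keys in their declared order and a set of
-- (key, value) pairs, return the values in key order, ready to be stored
-- through partitionInput_values.
-- Returns Nothing if any partition key has no corresponding value.
orderPartitionValues :: [String] -> [(String, String)] -> Maybe [String]
orderPartitionValues keys named = traverse (`lookup` named) keys
```

Building the list from the key order, rather than typing the values by hand, makes it harder to attach a value to the wrong key.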

PartitionValueList

PhysicalConnectionRequirements

physicalConnectionRequirements_availabilityZone :: Lens' PhysicalConnectionRequirements (Maybe Text) Source #

The connection's Availability Zone. This field is redundant because the specified subnet implies the Availability Zone to be used. Currently the field must be populated, but it will be deprecated in the future.

PostgreSQLCatalogSource

postgreSQLCatalogSource_table :: Lens' PostgreSQLCatalogSource Text Source #

The name of the table in the database to read from.

PostgreSQLCatalogTarget

postgreSQLCatalogTarget_inputs :: Lens' PostgreSQLCatalogTarget (NonEmpty Text) Source #

The nodes that are inputs to the data target.

postgreSQLCatalogTarget_table :: Lens' PostgreSQLCatalogTarget Text Source #

The name of the table in the database to write to.

Predecessor

predecessor_jobName :: Lens' Predecessor (Maybe Text) Source #

The name of the job definition used by the predecessor job run.

predecessor_runId :: Lens' Predecessor (Maybe Text) Source #

The job-run ID of the predecessor job run.

Predicate

predicate_conditions :: Lens' Predicate (Maybe [Condition]) Source #

A list of the conditions that determine when the trigger will fire.

predicate_logical :: Lens' Predicate (Maybe Logical) Source #

An optional field if only one condition is listed. If multiple conditions are listed, then this field is required.
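
The rule above can be expressed as a small local check. This is a sketch under stated assumptions: logicalFieldOk is a hypothetical name, the type variables stand in for the SDK's Condition and Logical types, and the case of zero conditions (which the description does not cover) is treated as not requiring the field.

```haskell
import Data.Maybe (isJust)

-- Hypothetical check, not part of amazonka-glue.
-- The logical operator may be omitted when at most one condition is
-- listed; with two or more conditions it must be present.
-- Assumption: an empty condition list does not require the operator.
logicalFieldOk :: [condition] -> Maybe logical -> Bool
logicalFieldOk conditions logical =
  length conditions <= 1 || isJust logical
```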

PrincipalPermissions

principalPermissions_permissions :: Lens' PrincipalPermissions (Maybe [Permission]) Source #

The permissions that are granted to the principal.

PropertyPredicate

propertyPredicate_comparator :: Lens' PropertyPredicate (Maybe Comparator) Source #

The comparator used to compare this property to others.

RecrawlPolicy

recrawlPolicy_recrawlBehavior :: Lens' RecrawlPolicy (Maybe RecrawlBehavior) Source #

Specifies whether to crawl the entire dataset again or to crawl only folders that were added since the last crawler run.

A value of CRAWL_EVERYTHING specifies crawling the entire dataset again.

A value of CRAWL_NEW_FOLDERS_ONLY specifies crawling only folders that were added since the last crawler run.

A value of CRAWL_EVENT_MODE specifies crawling only the changes identified by Amazon S3 events.

RedshiftSource

redshiftSource_redshiftTmpDir :: Lens' RedshiftSource (Maybe Text) Source #

The Amazon S3 path where temporary data can be staged when copying out of the database.

redshiftSource_name :: Lens' RedshiftSource Text Source #

The name of the Amazon Redshift data store.

redshiftSource_table :: Lens' RedshiftSource Text Source #

The database table to read from.

RedshiftTarget

redshiftTarget_redshiftTmpDir :: Lens' RedshiftTarget (Maybe Text) Source #

The Amazon S3 path where temporary data can be staged when copying out of the database.

redshiftTarget_upsertRedshiftOptions :: Lens' RedshiftTarget (Maybe UpsertRedshiftTargetOptions) Source #

The set of options to configure an upsert operation when writing to a Redshift target.

redshiftTarget_name :: Lens' RedshiftTarget Text Source #

The name of the data target.

redshiftTarget_inputs :: Lens' RedshiftTarget (NonEmpty Text) Source #

The nodes that are inputs to the data target.

redshiftTarget_database :: Lens' RedshiftTarget Text Source #

The name of the database to write to.

redshiftTarget_table :: Lens' RedshiftTarget Text Source #

The name of the table in the database to write to.

RegistryId

registryId_registryArn :: Lens' RegistryId (Maybe Text) Source #

The ARN of the registry to be updated. One of RegistryArn or RegistryName has to be provided.

registryId_registryName :: Lens' RegistryId (Maybe Text) Source #

Name of the registry. Used only for lookup. One of RegistryArn or RegistryName has to be provided.
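
One way to make the "one of" requirement hard to violate is to model the choice as a sum type and derive the two optional fields from it. The sketch below is illustrative only: RegistryRef and registryFields are hypothetical names, String stands in for Text, and supplying exactly one identifier is an assumption that satisfies either reading of "one of".

```haskell
-- Hypothetical local type, not part of amazonka-glue: a registry is
-- referenced either by ARN or by name.
data RegistryRef
  = ByArn String
  | ByName String
  deriving (Eq, Show)

-- Translate the reference into the two optional fields, in the order
-- (registryArn, registryName). Exactly one component is populated.
registryFields :: RegistryRef -> (Maybe String, Maybe String)
registryFields (ByArn arn)   = (Just arn, Nothing)
registryFields (ByName name) = (Nothing, Just name)
```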

RegistryListItem

registryListItem_createdTime :: Lens' RegistryListItem (Maybe Text) Source #

The date the registry was created.

registryListItem_registryArn :: Lens' RegistryListItem (Maybe Text) Source #

The Amazon Resource Name (ARN) of the registry.

registryListItem_updatedTime :: Lens' RegistryListItem (Maybe Text) Source #

The date the registry was updated.

RelationalCatalogSource

relationalCatalogSource_table :: Lens' RelationalCatalogSource Text Source #

The name of the table in the database to read from.

RenameField

renameField_name :: Lens' RenameField Text Source #

The name of the transform node.

renameField_inputs :: Lens' RenameField (NonEmpty Text) Source #

The data inputs identified by their node names.

renameField_sourcePath :: Lens' RenameField [Text] Source #

A JSON path to a variable in the data structure for the source data.

renameField_targetPath :: Lens' RenameField [Text] Source #

A JSON path to a variable in the data structure for the target data.

ResourceUri

resourceUri_uri :: Lens' ResourceUri (Maybe Text) Source #

The URI for accessing the resource.

S3CatalogSource

s3CatalogSource_partitionPredicate :: Lens' S3CatalogSource (Maybe Text) Source #

Partitions satisfying this predicate are deleted. Files within the retention period in these partitions are not deleted. The default is "" (empty).

s3CatalogSource_name :: Lens' S3CatalogSource Text Source #

The name of the data store.

s3CatalogSource_table :: Lens' S3CatalogSource Text Source #

The database table to read from.

S3CatalogTarget

s3CatalogTarget_partitionKeys :: Lens' S3CatalogTarget (Maybe [[Text]]) Source #

Specifies native partitioning using a sequence of keys.

s3CatalogTarget_schemaChangePolicy :: Lens' S3CatalogTarget (Maybe CatalogSchemaChangePolicy) Source #

A policy that specifies update behavior for the crawler.

s3CatalogTarget_name :: Lens' S3CatalogTarget Text Source #

The name of the data target.

s3CatalogTarget_inputs :: Lens' S3CatalogTarget (NonEmpty Text) Source #

The nodes that are inputs to the data target.

s3CatalogTarget_table :: Lens' S3CatalogTarget Text Source #

The name of the table in the database to write to.

s3CatalogTarget_database :: Lens' S3CatalogTarget Text Source #

The name of the database to write to.

S3CsvSource

s3CsvSource_compressionType :: Lens' S3CsvSource (Maybe CompressionType) Source #

Specifies how the data is compressed. This is generally not necessary if the data has a standard file extension. Possible values are "gzip" and "bzip".

s3CsvSource_escaper :: Lens' S3CsvSource (Maybe Text) Source #

Specifies a character to use for escaping. This option is used only when reading CSV files. The default value is none. If enabled, the character which immediately follows is used as-is, except for a small set of well-known escapes (\n, \r, \t, and \0).

s3CsvSource_exclusions :: Lens' S3CsvSource (Maybe [Text]) Source #

A string containing a JSON list of Unix-style glob patterns to exclude. For example, "[\"**.pdf\"]" excludes all PDF files.

s3CsvSource_groupFiles :: Lens' S3CsvSource (Maybe Text) Source #

Grouping files is turned on by default when the input contains more than 50,000 files. To turn on grouping with fewer than 50,000 files, set this parameter to "inPartition". To disable grouping when there are more than 50,000 files, set this parameter to "none".

s3CsvSource_groupSize :: Lens' S3CsvSource (Maybe Text) Source #

The target group size in bytes. The default is computed based on the input data size and the size of your cluster. When there are fewer than 50,000 input files, "groupFiles" must be set to "inPartition" for this to take effect.
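
The interaction between the file count and the groupFiles setting can be summarized as a small decision function. This is a sketch of the behavior described above, not SDK code: groupingActive is a hypothetical name, String stands in for Text, and any groupFiles value other than "inPartition" or "none" is assumed to fall back to the default rule.

```haskell
-- Threshold quoted in the descriptions above.
groupingThreshold :: Int
groupingThreshold = 50000

-- Hypothetical model of the documented behavior.
-- First argument: number of input files.
-- Second argument: the groupFiles setting, if any.
-- Assumption: unrecognized settings behave like an unset value.
groupingActive :: Int -> Maybe String -> Bool
groupingActive _ (Just "inPartition") = True
groupingActive _ (Just "none")        = False
groupingActive fileCount _            = fileCount > groupingThreshold
```

Under this reading, a groupSize value only matters when groupingActive returns True.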

s3CsvSource_maxBand :: Lens' S3CsvSource (Maybe Natural) Source #

This option controls the duration in milliseconds after which the Amazon S3 listing is likely to be consistent. Files with modification timestamps falling within the last maxBand milliseconds are tracked specially when using JobBookmarks to account for Amazon S3 eventual consistency. Most users don't need to set this option. The default is 900000 milliseconds, or 15 minutes.
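
The arithmetic behind the default, and the notion of a file falling "within the band", can be sketched as follows. The names are hypothetical, timestamps are assumed to be milliseconds since some common epoch, and treating the band boundary as inclusive is an assumption the description does not settle.

```haskell
-- Default quoted above, in milliseconds.
defaultMaxBandMs :: Integer
defaultMaxBandMs = 900000

-- Convert milliseconds to whole minutes (60000 ms per minute).
msToMinutes :: Integer -> Integer
msToMinutes ms = ms `div` 60000

-- Hypothetical check: was the file modified within the last maxBandMs
-- milliseconds before nowMs?
-- Assumption: the boundary itself counts as inside the band.
inBand :: Integer -> Integer -> Integer -> Bool
inBand maxBandMs nowMs modifiedMs =
  modifiedMs <= nowMs && nowMs - modifiedMs <= maxBandMs
```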

s3CsvSource_maxFilesInBand :: Lens' S3CsvSource (Maybe Natural) Source #

This option specifies the maximum number of files to save from the last maxBand milliseconds. If this number is exceeded, extra files are skipped and only processed in the next job run.

s3CsvSource_multiline :: Lens' S3CsvSource (Maybe Bool) Source #

A Boolean value that specifies whether a single record can span multiple lines. This can occur when a field contains a quoted new-line character. You must set this option to True if any record spans multiple lines. The default value is False, which allows for more aggressive file-splitting during parsing.

s3CsvSource_optimizePerformance :: Lens' S3CsvSource (Maybe Bool) Source #

A Boolean value that specifies whether to use the advanced SIMD CSV reader along with Apache Arrow based columnar memory formats. Only available in Glue version 3.0.

s3CsvSource_outputSchemas :: Lens' S3CsvSource (Maybe [GlueSchema]) Source #

Specifies the data schema for the S3 CSV source.

s3CsvSource_recurse :: Lens' S3CsvSource (Maybe Bool) Source #

If set to true, recursively reads files in all subdirectories under the specified paths.

s3CsvSource_skipFirst :: Lens' S3CsvSource (Maybe Bool) Source #

A Boolean value that specifies whether to skip the first data line. The default value is False.

s3CsvSource_withHeader :: Lens' S3CsvSource (Maybe Bool) Source #

A Boolean value that specifies whether to treat the first line as a header. The default value is False.

s3CsvSource_writeHeader :: Lens' S3CsvSource (Maybe Bool) Source #

A Boolean value that specifies whether to write the header to output. The default value is True.

s3CsvSource_name :: Lens' S3CsvSource Text Source #

The name of the data store.

s3CsvSource_paths :: Lens' S3CsvSource [Text] Source #

A list of the Amazon S3 paths to read from.

s3CsvSource_separator :: Lens' S3CsvSource Separator Source #

Specifies the delimiter character. The default is a comma: ",", but any other character can be specified.

s3CsvSource_quoteChar :: Lens' S3CsvSource QuoteChar Source #

Specifies the character to use for quoting. The default is a double quote: '"'. Set this to -1 to turn off quoting entirely.

S3DirectSourceAdditionalOptions

s3DirectSourceAdditionalOptions_boundedFiles :: Lens' S3DirectSourceAdditionalOptions (Maybe Integer) Source #

Sets the upper limit for the target number of files that will be processed.

s3DirectSourceAdditionalOptions_boundedSize :: Lens' S3DirectSourceAdditionalOptions (Maybe Integer) Source #

Sets the upper limit for the target size of the dataset in bytes that will be processed.

S3DirectTarget

s3DirectTarget_compression :: Lens' S3DirectTarget (Maybe Text) Source #

Specifies how the data is compressed. This is generally not necessary if the data has a standard file extension. Possible values are "gzip" and "bzip".

s3DirectTarget_partitionKeys :: Lens' S3DirectTarget (Maybe [[Text]]) Source #

Specifies native partitioning using a sequence of keys.

s3DirectTarget_schemaChangePolicy :: Lens' S3DirectTarget (Maybe DirectSchemaChangePolicy) Source #

A policy that specifies update behavior for the crawler.

s3DirectTarget_name :: Lens' S3DirectTarget Text Source #

The name of the data target.

s3DirectTarget_inputs :: Lens' S3DirectTarget (NonEmpty Text) Source #

The nodes that are inputs to the data target.

s3DirectTarget_path :: Lens' S3DirectTarget Text Source #

A single Amazon S3 path to write to.

s3DirectTarget_format :: Lens' S3DirectTarget TargetFormat Source #

Specifies the data output format for the target.

S3Encryption

s3Encryption_kmsKeyArn :: Lens' S3Encryption (Maybe Text) Source #

The Amazon Resource Name (ARN) of the KMS key to be used to encrypt the data.

s3Encryption_s3EncryptionMode :: Lens' S3Encryption (Maybe S3EncryptionMode) Source #

The encryption mode to use for Amazon S3 data.

S3GlueParquetTarget

s3GlueParquetTarget_compression :: Lens' S3GlueParquetTarget (Maybe ParquetCompressionType) Source #

Specifies how the data is compressed. This is generally not necessary if the data has a standard file extension. Possible values are "gzip" and "bzip".

s3GlueParquetTarget_partitionKeys :: Lens' S3GlueParquetTarget (Maybe [[Text]]) Source #

Specifies native partitioning using a sequence of keys.

s3GlueParquetTarget_schemaChangePolicy :: Lens' S3GlueParquetTarget (Maybe DirectSchemaChangePolicy) Source #

A policy that specifies update behavior for the crawler.

s3GlueParquetTarget_inputs :: Lens' S3GlueParquetTarget (NonEmpty Text) Source #

The nodes that are inputs to the data target.

s3GlueParquetTarget_path :: Lens' S3GlueParquetTarget Text Source #

A single Amazon S3 path to write to.

S3JsonSource

s3JsonSource_compressionType :: Lens' S3JsonSource (Maybe CompressionType) Source #

Specifies how the data is compressed. This is generally not necessary if the data has a standard file extension. Possible values are "gzip" and "bzip".

s3JsonSource_exclusions :: Lens' S3JsonSource (Maybe [Text]) Source #

A string containing a JSON list of Unix-style glob patterns to exclude. For example, "[\"**.pdf\"]" excludes all PDF files.

s3JsonSource_groupFiles :: Lens' S3JsonSource (Maybe Text) Source #

Grouping files is turned on by default when the input contains more than 50,000 files. To turn on grouping with fewer than 50,000 files, set this parameter to "inPartition". To disable grouping when there are more than 50,000 files, set this parameter to "none".

s3JsonSource_groupSize :: Lens' S3JsonSource (Maybe Text) Source #

The target group size in bytes. The default is computed based on the input data size and the size of your cluster. When there are fewer than 50,000 input files, "groupFiles" must be set to "inPartition" for this to take effect.

s3JsonSource_jsonPath :: Lens' S3JsonSource (Maybe Text) Source #

A JsonPath string defining the JSON data.

s3JsonSource_maxBand :: Lens' S3JsonSource (Maybe Natural) Source #

This option controls the duration in milliseconds after which the Amazon S3 listing is likely to be consistent. Files with modification timestamps falling within the last maxBand milliseconds are tracked specially when using JobBookmarks to account for Amazon S3 eventual consistency. Most users don't need to set this option. The default is 900000 milliseconds, or 15 minutes.

s3JsonSource_maxFilesInBand :: Lens' S3JsonSource (Maybe Natural) Source #

This option specifies the maximum number of files to save from the last maxBand milliseconds. If this number is exceeded, extra files are skipped and only processed in the next job run.

s3JsonSource_multiline :: Lens' S3JsonSource (Maybe Bool) Source #

A Boolean value that specifies whether a single record can span multiple lines. This can occur when a field contains a quoted new-line character. You must set this option to True if any record spans multiple lines. The default value is False, which allows for more aggressive file-splitting during parsing.

s3JsonSource_outputSchemas :: Lens' S3JsonSource (Maybe [GlueSchema]) Source #

Specifies the data schema for the S3 JSON source.

s3JsonSource_recurse :: Lens' S3JsonSource (Maybe Bool) Source #

If set to true, recursively reads files in all subdirectories under the specified paths.

s3JsonSource_name :: Lens' S3JsonSource Text Source #

The name of the data store.

s3JsonSource_paths :: Lens' S3JsonSource [Text] Source #

A list of the Amazon S3 paths to read from.

S3ParquetSource

s3ParquetSource_compressionType :: Lens' S3ParquetSource (Maybe ParquetCompressionType) Source #

Specifies how the data is compressed. This is generally not necessary if the data has a standard file extension. Possible values are "gzip" and "bzip".

s3ParquetSource_exclusions :: Lens' S3ParquetSource (Maybe [Text]) Source #

A string containing a JSON list of Unix-style glob patterns to exclude. For example, "[\"**.pdf\"]" excludes all PDF files.

s3ParquetSource_groupFiles :: Lens' S3ParquetSource (Maybe Text) Source #

Grouping files is turned on by default when the input contains more than 50,000 files. To turn on grouping with fewer than 50,000 files, set this parameter to "inPartition". To disable grouping when there are more than 50,000 files, set this parameter to "none".

s3ParquetSource_groupSize :: Lens' S3ParquetSource (Maybe Text) Source #

The target group size in bytes. The default is computed based on the input data size and the size of your cluster. When there are fewer than 50,000 input files, "groupFiles" must be set to "inPartition" for this to take effect.

s3ParquetSource_maxBand :: Lens' S3ParquetSource (Maybe Natural) Source #

This option controls the duration in milliseconds after which the Amazon S3 listing is likely to be consistent. Files with modification timestamps falling within the last maxBand milliseconds are tracked specially when using JobBookmarks to account for Amazon S3 eventual consistency. Most users don't need to set this option. The default is 900000 milliseconds, or 15 minutes.

s3ParquetSource_maxFilesInBand :: Lens' S3ParquetSource (Maybe Natural) Source #

This option specifies the maximum number of files to save from the last maxBand milliseconds. If this number is exceeded, extra files are skipped and only processed in the next job run.

s3ParquetSource_outputSchemas :: Lens' S3ParquetSource (Maybe [GlueSchema]) Source #

Specifies the data schema for the S3 Parquet source.

s3ParquetSource_recurse :: Lens' S3ParquetSource (Maybe Bool) Source #

If set to true, recursively reads files in all subdirectories under the specified paths.

s3ParquetSource_name :: Lens' S3ParquetSource Text Source #

The name of the data store.

s3ParquetSource_paths :: Lens' S3ParquetSource [Text] Source #

A list of the Amazon S3 paths to read from.

S3SourceAdditionalOptions

s3SourceAdditionalOptions_boundedFiles :: Lens' S3SourceAdditionalOptions (Maybe Integer) Source #

Sets the upper limit for the target number of files that will be processed.

s3SourceAdditionalOptions_boundedSize :: Lens' S3SourceAdditionalOptions (Maybe Integer) Source #

Sets the upper limit for the target size of the dataset in bytes that will be processed.

S3Target

s3Target_connectionName :: Lens' S3Target (Maybe Text) Source #

The name of a connection which allows a job or crawler to access data in Amazon S3 within an Amazon Virtual Private Cloud environment (Amazon VPC).

s3Target_dlqEventQueueArn :: Lens' S3Target (Maybe Text) Source #

A valid Amazon dead-letter SQS ARN. For example, arn:aws:sqs:region:account:deadLetterQueue.

s3Target_eventQueueArn :: Lens' S3Target (Maybe Text) Source #

A valid Amazon SQS ARN. For example, arn:aws:sqs:region:account:sqs.

s3Target_exclusions :: Lens' S3Target (Maybe [Text]) Source #

A list of glob patterns used to exclude from the crawl. For more information, see Catalog Tables with a Crawler.

s3Target_path :: Lens' S3Target (Maybe Text) Source #

The path to the Amazon S3 target.

s3Target_sampleSize :: Lens' S3Target (Maybe Int) Source #

Sets the number of files in each leaf folder to be crawled when crawling sample files in a dataset. If not set, all the files are crawled. A valid value is an integer between 1 and 249.
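
Because the field's type is a plain Int, the documented range is not enforced by the type system. A minimal local check might look like the sketch below; checkSampleSize is a hypothetical name, and reading "between 1 and 249" as inclusive at both ends is an assumption.

```haskell
-- Hypothetical validation, not part of amazonka-glue.
-- Accepts values in the documented range and explains any rejection.
-- Assumption: both bounds (1 and 249) are inclusive.
checkSampleSize :: Int -> Either String Int
checkSampleSize n
  | n < 1     = Left "sample size must be at least 1"
  | n > 249   = Left "sample size must be at most 249"
  | otherwise = Right n
```

Returning Either rather than Maybe keeps an out-of-range value from being silently turned into "not set", which per the description above would crawl all files.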

Schedule

schedule_scheduleExpression :: Lens' Schedule (Maybe Text) Source #

A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).
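
A daily expression of the shape shown above can be assembled from an hour and a minute. This is a minimal sketch: dailyAt is a hypothetical helper, it covers only the "every day at HH:MM UTC" case, and it produces a String that would still need converting to Text before use with the lens.

```haskell
-- Hypothetical helper, not part of amazonka-glue.
-- Build a "run every day at hour:minute UTC" expression in the
-- cron(minute hour * * ? *) form used in the example above.
-- Returns Nothing for an hour outside 0..23 or a minute outside 0..59.
dailyAt :: Int -> Int -> Maybe String
dailyAt hour minute
  | hour < 0 || hour > 23     = Nothing
  | minute < 0 || minute > 59 = Nothing
  | otherwise =
      Just ("cron(" ++ show minute ++ " " ++ show hour ++ " * * ? *)")
```

With this helper, dailyAt 12 15 yields Just "cron(15 12 * * ? *)", matching the example in the description.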

schedule_state :: Lens' Schedule (Maybe ScheduleState) Source #

The state of the schedule.

SchemaChangePolicy

schemaChangePolicy_deleteBehavior :: Lens' SchemaChangePolicy (Maybe DeleteBehavior) Source #

The deletion behavior when the crawler finds a deleted object.

schemaChangePolicy_updateBehavior :: Lens' SchemaChangePolicy (Maybe UpdateBehavior) Source #

The update behavior when the crawler finds a changed schema.

SchemaColumn

schemaColumn_dataType :: Lens' SchemaColumn (Maybe Text) Source #

The type of data in the column.

schemaColumn_name :: Lens' SchemaColumn (Maybe Text) Source #

The name of the column.

SchemaId

schemaId_registryName :: Lens' SchemaId (Maybe Text) Source #

The name of the schema registry that contains the schema.

schemaId_schemaArn :: Lens' SchemaId (Maybe Text) Source #

The Amazon Resource Name (ARN) of the schema. One of SchemaArn or SchemaName has to be provided.

schemaId_schemaName :: Lens' SchemaId (Maybe Text) Source #

The name of the schema. One of SchemaArn or SchemaName has to be provided.

SchemaListItem

schemaListItem_createdTime :: Lens' SchemaListItem (Maybe Text) Source #

The date and time that a schema was created.

schemaListItem_registryName :: Lens' SchemaListItem (Maybe Text) Source #

The name of the registry where the schema resides.

schemaListItem_schemaArn :: Lens' SchemaListItem (Maybe Text) Source #

The Amazon Resource Name (ARN) for the schema.

schemaListItem_updatedTime :: Lens' SchemaListItem (Maybe Text) Source #

The date and time that a schema was updated.

SchemaReference

schemaReference_schemaId :: Lens' SchemaReference (Maybe SchemaId) Source #

A structure that contains schema identity fields. Either this or the SchemaVersionId has to be provided.

schemaReference_schemaVersionId :: Lens' SchemaReference (Maybe Text) Source #

The unique ID assigned to a version of the schema. Either this or the SchemaId has to be provided.

SchemaVersionErrorItem

schemaVersionErrorItem_errorDetails :: Lens' SchemaVersionErrorItem (Maybe ErrorDetails) Source #

The details of the error for the schema version.

SchemaVersionListItem

schemaVersionListItem_createdTime :: Lens' SchemaVersionListItem (Maybe Text) Source #

The date and time the schema version was created.

schemaVersionListItem_schemaArn :: Lens' SchemaVersionListItem (Maybe Text) Source #

The Amazon Resource Name (ARN) of the schema.

schemaVersionListItem_schemaVersionId :: Lens' SchemaVersionListItem (Maybe Text) Source #

The unique identifier of the schema version.

SchemaVersionNumber

schemaVersionNumber_latestVersion :: Lens' SchemaVersionNumber (Maybe Bool) Source #

The latest version available for the schema.

SecurityConfiguration

securityConfiguration_createdTimeStamp :: Lens' SecurityConfiguration (Maybe UTCTime) Source #

The time at which this security configuration was created.

securityConfiguration_encryptionConfiguration :: Lens' SecurityConfiguration (Maybe EncryptionConfiguration) Source #

The encryption configuration associated with this security configuration.

securityConfiguration_name :: Lens' SecurityConfiguration (Maybe Text) Source #

The name of the security configuration.

Segment

segment_segmentNumber :: Lens' Segment Natural Source #

The zero-based index number of the segment. For example, if the total number of segments is 4, SegmentNumber values range from 0 through 3.

segment_totalSegments :: Lens' Segment Natural Source #

The total number of segments.
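
The zero-based numbering can be illustrated by enumerating every (segment number, total segments) pair for a given total. This is a sketch only: segments is a hypothetical helper, and Int is used in place of Natural so that a total of zero yields an empty list instead of an arithmetic underflow.

```haskell
-- Hypothetical helper, not part of amazonka-glue.
-- List every (segmentNumber, totalSegments) pair for the given total.
-- Segment numbers start at 0 and end at total - 1.
segments :: Int -> [(Int, Int)]
segments total = [(n, total) | n <- [0 .. total - 1]]
```

For a total of 4, the segment numbers produced are 0, 1, 2 and 3, as in the example above.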

SelectFields

selectFields_name :: Lens' SelectFields Text Source #

The name of the transform node.

selectFields_inputs :: Lens' SelectFields (NonEmpty Text) Source #

The data inputs identified by their node names.

selectFields_paths :: Lens' SelectFields [[Text]] Source #

A JSON path to a variable in the data structure.

SelectFromCollection

selectFromCollection_inputs :: Lens' SelectFromCollection (NonEmpty Text) Source #

The data inputs identified by their node names.

selectFromCollection_index :: Lens' SelectFromCollection Natural Source #

The index for the DynamicFrame to be selected.

SerDeInfo

serDeInfo_parameters :: Lens' SerDeInfo (Maybe (HashMap Text Text)) Source #

These key-value pairs define initialization parameters for the SerDe.

serDeInfo_serializationLibrary :: Lens' SerDeInfo (Maybe Text) Source #

Usually the class that implements the SerDe. An example is org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe.

Session

session_command :: Lens' Session (Maybe SessionCommand) Source #

The command object. See SessionCommand.

session_connections :: Lens' Session (Maybe ConnectionsList) Source #

The number of connections used for the session.

session_createdOn :: Lens' Session (Maybe UTCTime) Source #

The time and date when the session was created.

session_defaultArguments :: Lens' Session (Maybe (HashMap Text Text)) Source #

A map of key-value pairs. The maximum is 75 pairs.

session_description :: Lens' Session (Maybe Text) Source #

The description of the session.

session_errorMessage :: Lens' Session (Maybe Text) Source #

The error message displayed during the session.

session_glueVersion :: Lens' Session (Maybe Text) Source #

The Glue version determines the versions of Apache Spark and Python that Glue supports. The GlueVersion must be greater than 2.0.

session_id :: Lens' Session (Maybe Text) Source #

The ID of the session.

session_maxCapacity :: Lens' Session (Maybe Double) Source #

The number of Glue data processing units (DPUs) that can be allocated when the job runs. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB memory.

session_progress :: Lens' Session (Maybe Double) Source #

The code execution progress of the session.

session_role :: Lens' Session (Maybe Text) Source #

The name or Amazon Resource Name (ARN) of the IAM role associated with the Session.

session_securityConfiguration :: Lens' Session (Maybe Text) Source #

The name of the SecurityConfiguration structure to be used with the session.

SessionCommand

sessionCommand_name :: Lens' SessionCommand (Maybe Text) Source #

Specifies the name of the SessionCommand. Can be 'glueetl' or 'gluestreaming'.

sessionCommand_pythonVersion :: Lens' SessionCommand (Maybe Text) Source #

Specifies the Python version. The Python version indicates the version supported for jobs of type Spark.

SkewedInfo

skewedInfo_skewedColumnNames :: Lens' SkewedInfo (Maybe [Text]) Source #

A list of names of columns that contain skewed values.

skewedInfo_skewedColumnValueLocationMaps :: Lens' SkewedInfo (Maybe (HashMap Text Text)) Source #

A mapping of skewed values to the columns that contain them.

skewedInfo_skewedColumnValues :: Lens' SkewedInfo (Maybe [Text]) Source #

A list of values that appear so frequently as to be considered skewed.

SortCriterion

sortCriterion_fieldName :: Lens' SortCriterion (Maybe Text) Source #

The name of the field on which to sort.

sortCriterion_sort :: Lens' SortCriterion (Maybe Sort) Source #

An ascending or descending sort.

SourceControlDetails

sourceControlDetails_authStrategy :: Lens' SourceControlDetails (Maybe SourceControlAuthStrategy) Source #

The type of authentication, which can be an authentication token stored in Amazon Web Services Secrets Manager, or a personal access token.

sourceControlDetails_branch :: Lens' SourceControlDetails (Maybe Text) Source #

An optional branch in the remote repository.

sourceControlDetails_folder :: Lens' SourceControlDetails (Maybe Text) Source #

An optional folder in the remote repository.

sourceControlDetails_lastCommitId :: Lens' SourceControlDetails (Maybe Text) Source #

The last commit ID for a commit in the remote repository.

sourceControlDetails_owner :: Lens' SourceControlDetails (Maybe Text) Source #

The owner of the remote repository that contains the job artifacts.

sourceControlDetails_repository :: Lens' SourceControlDetails (Maybe Text) Source #

The name of the remote repository that contains the job artifacts.

SparkConnectorSource

sparkConnectorSource_additionalOptions :: Lens' SparkConnectorSource (Maybe (HashMap Text Text)) Source #

Additional connection options for the connector.

sparkConnectorSource_outputSchemas :: Lens' SparkConnectorSource (Maybe [GlueSchema]) Source #

Specifies the data schema for the custom Spark source.

sparkConnectorSource_connectionName :: Lens' SparkConnectorSource Text Source #

The name of the connection that is associated with the connector.

sparkConnectorSource_connectorName :: Lens' SparkConnectorSource Text Source #

The name of a connector that assists with accessing the data store in Glue Studio.

sparkConnectorSource_connectionType :: Lens' SparkConnectorSource Text Source #

The type of connection, such as marketplace.spark or custom.spark, designating a connection to an Apache Spark data store.

SparkConnectorTarget

sparkConnectorTarget_additionalOptions :: Lens' SparkConnectorTarget (Maybe (HashMap Text Text)) Source #

Additional connection options for the connector.

sparkConnectorTarget_outputSchemas :: Lens' SparkConnectorTarget (Maybe [GlueSchema]) Source #

Specifies the data schema for the custom Spark target.

sparkConnectorTarget_inputs :: Lens' SparkConnectorTarget (NonEmpty Text) Source #

The nodes that are inputs to the data target.

sparkConnectorTarget_connectionName :: Lens' SparkConnectorTarget Text Source #

The name of a connection for an Apache Spark connector.

sparkConnectorTarget_connectionType :: Lens' SparkConnectorTarget Text Source #

The type of connection, such as marketplace.spark or custom.spark, designating a connection to an Apache Spark data store.

SparkSQL

sparkSQL_outputSchemas :: Lens' SparkSQL (Maybe [GlueSchema]) Source #

Specifies the data schema for the SparkSQL transform.

sparkSQL_name :: Lens' SparkSQL Text Source #

The name of the transform node.

sparkSQL_inputs :: Lens' SparkSQL (NonEmpty Text) Source #

The data inputs identified by their node names. You can associate a table name with each input node to use in the SQL query. The name you choose must meet the Spark SQL naming restrictions.

sparkSQL_sqlQuery :: Lens' SparkSQL Text Source #

A SQL query that must use Spark SQL syntax and return a single data set.

sparkSQL_sqlAliases :: Lens' SparkSQL [SqlAlias] Source #

A list of aliases. An alias lets you specify the name to use in the SQL for a given input. For example, suppose you have a data source named "MyDataSource". If you specify From as MyDataSource and Alias as SqlName, then in your SQL you can write:

select * from SqlName

and that gets data from MyDataSource.
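The alias mechanism above can be sketched with the lenses in this module. This is a minimal sketch, not a definitive recipe: it assumes that newSparkSQL takes the required name, inputs, and sqlQuery fields in that order, that newSqlAlias takes From and then Alias, and that the lens package supplies the operators. The node name "query-node" is hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Amazonka.Glue.Lens (sparkSQL_sqlAliases, sqlAlias_alias, sqlAlias_from)
import Amazonka.Glue.Types (SparkSQL, newSparkSQL, newSqlAlias)
import Control.Lens ((&), (.~), (^.))
import Data.List.NonEmpty (NonEmpty (..))

-- Assumption: newSparkSQL takes name, inputs, sqlQuery (in that order),
-- and newSqlAlias takes the From value first and the Alias second.
aliasedQuery :: SparkSQL
aliasedQuery =
  newSparkSQL "query-node" ("MyDataSource" :| []) "select * from SqlName"
    & sparkSQL_sqlAliases .~ [newSqlAlias "MyDataSource" "SqlName"]

main :: IO ()
main =
  -- Print each (From, Alias) pair carried by the transform node.
  mapM_
    (\a -> print (a ^. sqlAlias_from, a ^. sqlAlias_alias))
    (aliasedQuery ^. sparkSQL_sqlAliases)
```

Because sqlAliases is a plain list rather than a Maybe, it is set with (.~) instead of (?~).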

Spigot

spigot_prob :: Lens' Spigot (Maybe Double) Source #

The probability (a decimal value with a maximum value of 1) of picking any given record. A value of 1 indicates that each row read from the dataset should be included in the sample output.

spigot_topk :: Lens' Spigot (Maybe Natural) Source #

Specifies a number of records to write starting from the beginning of the dataset.

spigot_name :: Lens' Spigot Text Source #

The name of the transform node.

spigot_inputs :: Lens' Spigot (NonEmpty Text) Source #

The data inputs identified by their node names.

spigot_path :: Lens' Spigot Text Source #

A path in Amazon S3 where the transform writes a subset of records from the dataset to a JSON file.
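The two sampling controls of Spigot might be set as follows. This is a sketch under assumptions: newSpigot is assumed to take the required name, inputs, and path fields in that order, and the node names and S3 path are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Amazonka.Glue.Lens (spigot_prob, spigot_topk)
import Amazonka.Glue.Types (Spigot, newSpigot)
import Control.Lens ((&), (?~), (^.))
import Data.List.NonEmpty (NonEmpty (..))

-- Assumption: newSpigot takes name, inputs, path (in that order).
baseSpigot :: Spigot
baseSpigot =
  newSpigot "sample-node" ("source-node" :| []) "s3://example-bucket/samples/"

-- Keep each record with probability 0.25.
randomSample :: Spigot
randomSample = baseSpigot & spigot_prob ?~ 0.25

-- Write only the first 100 records of the dataset.
firstHundred :: Spigot
firstHundred = baseSpigot & spigot_topk ?~ 100

main :: IO ()
main = do
  print (randomSample ^. spigot_prob)
  print (firstHundred ^. spigot_topk)
```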

SplitFields

splitFields_name :: Lens' SplitFields Text Source #

The name of the transform node.

splitFields_inputs :: Lens' SplitFields (NonEmpty Text) Source #

The data inputs identified by their node names.

splitFields_paths :: Lens' SplitFields [[Text]] Source #

A JSON path to a variable in the data structure.

SqlAlias

sqlAlias_from :: Lens' SqlAlias Text Source #

A table, or a column in a table.

sqlAlias_alias :: Lens' SqlAlias Text Source #

A temporary name given to a table, or a column in a table.

StartingEventBatchCondition

Statement

statement_code :: Lens' Statement (Maybe Text) Source #

The execution code of the statement.

statement_completedOn :: Lens' Statement (Maybe Integer) Source #

The Unix time and date that the job definition was completed.

statement_id :: Lens' Statement (Maybe Int) Source #

The ID of the statement.

statement_progress :: Lens' Statement (Maybe Double) Source #

The code execution progress.

statement_startedOn :: Lens' Statement (Maybe Integer) Source #

The Unix time and date that the job definition was started.

statement_state :: Lens' Statement (Maybe StatementState) Source #

The state while the request is actioned.

StatementOutput

statementOutput_errorName :: Lens' StatementOutput (Maybe Text) Source #

The name of the error in the output.

statementOutput_executionCount :: Lens' StatementOutput (Maybe Int) Source #

The execution count of the output.

statementOutput_status :: Lens' StatementOutput (Maybe StatementState) Source #

The status of the code execution output.

statementOutput_traceback :: Lens' StatementOutput (Maybe [Text]) Source #

The traceback of the output.

StatementOutputData

statementOutputData_textPlain :: Lens' StatementOutputData (Maybe Text) Source #

The code execution output in text format.

StorageDescriptor

storageDescriptor_additionalLocations :: Lens' StorageDescriptor (Maybe [Text]) Source #

A list of locations that point to the path where a Delta table is located.

storageDescriptor_bucketColumns :: Lens' StorageDescriptor (Maybe [Text]) Source #

A list of reducer grouping columns, clustering columns, and bucketing columns in the table.

storageDescriptor_columns :: Lens' StorageDescriptor (Maybe [Column]) Source #

A list of the Columns in the table.

storageDescriptor_compressed :: Lens' StorageDescriptor (Maybe Bool) Source #

True if the data in the table is compressed, or False if not.

storageDescriptor_inputFormat :: Lens' StorageDescriptor (Maybe Text) Source #

The input format: SequenceFileInputFormat (binary), or TextInputFormat, or a custom format.

storageDescriptor_location :: Lens' StorageDescriptor (Maybe Text) Source #

The physical location of the table. By default, this takes the form of the warehouse location, followed by the database location in the warehouse, followed by the table name.

storageDescriptor_numberOfBuckets :: Lens' StorageDescriptor (Maybe Int) Source #

Must be specified if the table contains any dimension columns.

storageDescriptor_outputFormat :: Lens' StorageDescriptor (Maybe Text) Source #

The output format: SequenceFileOutputFormat (binary), or IgnoreKeyTextOutputFormat, or a custom format.

storageDescriptor_parameters :: Lens' StorageDescriptor (Maybe (HashMap Text Text)) Source #

The user-supplied properties in key-value form.

storageDescriptor_schemaReference :: Lens' StorageDescriptor (Maybe SchemaReference) Source #

An object that references a schema stored in the Glue Schema Registry.

When creating a table, you can pass an empty list of columns for the schema, and instead use a schema reference.
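As an illustration of the note above, a storage descriptor could carry an empty column list together with a schema reference. This is a minimal sketch; it assumes that newStorageDescriptor and newSchemaReference take no arguments (all of their fields being optional), and viaRegistry is a hypothetical helper name.

```haskell
import Amazonka.Glue.Lens
  ( storageDescriptor_columns,
    storageDescriptor_schemaReference,
  )
import Amazonka.Glue.Types
  ( SchemaReference,
    StorageDescriptor,
    newSchemaReference,
    newStorageDescriptor,
  )
import Control.Lens ((&), (?~), (^.))
import Data.Maybe (isJust)

-- Describe the columns through a registry schema instead of listing them inline.
viaRegistry :: SchemaReference -> StorageDescriptor
viaRegistry ref =
  newStorageDescriptor
    & storageDescriptor_columns ?~ []
    & storageDescriptor_schemaReference ?~ ref

main :: IO ()
main = do
  -- newSchemaReference leaves every field unset; a real reference would
  -- identify a schema in the Glue Schema Registry.
  let sd = viaRegistry newSchemaReference
  print (isJust (sd ^. storageDescriptor_schemaReference))
  print (fmap length (sd ^. storageDescriptor_columns))
```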

storageDescriptor_serdeInfo :: Lens' StorageDescriptor (Maybe SerDeInfo) Source #

The serialization/deserialization (SerDe) information.

storageDescriptor_skewedInfo :: Lens' StorageDescriptor (Maybe SkewedInfo) Source #

The information about values that appear frequently in a column (skewed values).

storageDescriptor_sortColumns :: Lens' StorageDescriptor (Maybe [Order]) Source #

A list specifying the sort order of each bucket in the table.

storageDescriptor_storedAsSubDirectories :: Lens' StorageDescriptor (Maybe Bool) Source #

True if the table data is stored in subdirectories, or False if not.

StreamingDataPreviewOptions

StringColumnStatisticsData

Table

table_catalogId :: Lens' Table (Maybe Text) Source #

The ID of the Data Catalog in which the table resides.

table_createTime :: Lens' Table (Maybe UTCTime) Source #

The time when the table definition was created in the Data Catalog.

table_createdBy :: Lens' Table (Maybe Text) Source #

The person or entity who created the table.

table_databaseName :: Lens' Table (Maybe Text) Source #

The name of the database where the table metadata resides. For Hive compatibility, this must be all lowercase.

table_description :: Lens' Table (Maybe Text) Source #

A description of the table.

table_isRegisteredWithLakeFormation :: Lens' Table (Maybe Bool) Source #

Indicates whether the table has been registered with Lake Formation.

table_lastAccessTime :: Lens' Table (Maybe UTCTime) Source #

The last time that the table was accessed. This is usually taken from HDFS, and might not be reliable.

table_lastAnalyzedTime :: Lens' Table (Maybe UTCTime) Source #

The last time that column statistics were computed for this table.

table_owner :: Lens' Table (Maybe Text) Source #

The owner of the table.

table_parameters :: Lens' Table (Maybe (HashMap Text Text)) Source #

These key-value pairs define properties associated with the table.

table_partitionKeys :: Lens' Table (Maybe [Column]) Source #

A list of columns by which the table is partitioned. Only primitive types are supported as partition keys.

When you create a table used by Amazon Athena, and you do not specify any partitionKeys, you must at least set the value of partitionKeys to an empty list. For example:

"PartitionKeys": []

table_retention :: Lens' Table (Maybe Natural) Source #

The retention time for this table.

table_storageDescriptor :: Lens' Table (Maybe StorageDescriptor) Source #

A storage descriptor containing information about the physical storage of this table.

table_tableType :: Lens' Table (Maybe Text) Source #

The type of this table (EXTERNAL_TABLE, VIRTUAL_VIEW, etc.).

table_targetTable :: Lens' Table (Maybe TableIdentifier) Source #

A TableIdentifier structure that describes a target table for resource linking.

table_updateTime :: Lens' Table (Maybe UTCTime) Source #

The last time that the table was updated.

table_versionId :: Lens' Table (Maybe Text) Source #

The ID of the table version.

table_viewExpandedText :: Lens' Table (Maybe Text) Source #

If the table is a view, the expanded text of the view; otherwise null.

table_viewOriginalText :: Lens' Table (Maybe Text) Source #

If the table is a view, the original text of the view; otherwise null.

table_name :: Lens' Table Text Source #

The table name. For Hive compatibility, this must be entirely lowercase.

TableError

tableError_tableName :: Lens' TableError (Maybe Text) Source #

The name of the table. For Hive compatibility, this must be entirely lowercase.

TableIdentifier

tableIdentifier_catalogId :: Lens' TableIdentifier (Maybe Text) Source #

The ID of the Data Catalog in which the table resides.

tableIdentifier_databaseName :: Lens' TableIdentifier (Maybe Text) Source #

The name of the catalog database that contains the target table.

tableIdentifier_name :: Lens' TableIdentifier (Maybe Text) Source #

The name of the target table.

TableInput

tableInput_description :: Lens' TableInput (Maybe Text) Source #

A description of the table.

tableInput_lastAccessTime :: Lens' TableInput (Maybe UTCTime) Source #

The last time that the table was accessed.

tableInput_lastAnalyzedTime :: Lens' TableInput (Maybe UTCTime) Source #

The last time that column statistics were computed for this table.

tableInput_parameters :: Lens' TableInput (Maybe (HashMap Text Text)) Source #

These key-value pairs define properties associated with the table.

tableInput_partitionKeys :: Lens' TableInput (Maybe [Column]) Source #

A list of columns by which the table is partitioned. Only primitive types are supported as partition keys.

When you create a table used by Amazon Athena, and you do not specify any partitionKeys, you must at least set the value of partitionKeys to an empty list. For example:

"PartitionKeys": []

tableInput_retention :: Lens' TableInput (Maybe Natural) Source #

The retention time for this table.

tableInput_storageDescriptor :: Lens' TableInput (Maybe StorageDescriptor) Source #

A storage descriptor containing information about the physical storage of this table.

tableInput_tableType :: Lens' TableInput (Maybe Text) Source #

The type of this table (EXTERNAL_TABLE, VIRTUAL_VIEW, etc.).

tableInput_targetTable :: Lens' TableInput (Maybe TableIdentifier) Source #

A TableIdentifier structure that describes a target table for resource linking.

tableInput_viewExpandedText :: Lens' TableInput (Maybe Text) Source #

If the table is a view, the expanded text of the view; otherwise null.

tableInput_viewOriginalText :: Lens' TableInput (Maybe Text) Source #

If the table is a view, the original text of the view; otherwise null.

tableInput_name :: Lens' TableInput Text Source #

The table name. For Hive compatibility, this is folded to lowercase when it is stored.
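The Athena note on tableInput_partitionKeys distinguishes an unset field from an explicitly empty list. With these lenses the difference is Nothing versus Just []. The sketch below assumes that newTableInput takes the required table name as its only argument; the table name is hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Amazonka.Glue.Lens (tableInput_partitionKeys, tableInput_tableType)
import Amazonka.Glue.Types (TableInput, newTableInput)
import Control.Lens ((&), (?~), (^.))

-- Assumption: newTableInput takes the required table name as its only argument.
athenaTable :: TableInput
athenaTable =
  newTableInput "example_table"
    & tableInput_tableType ?~ "EXTERNAL_TABLE"
    -- Just [] carries an explicit empty list; Nothing would leave the field unset.
    & tableInput_partitionKeys ?~ []

main :: IO ()
main = print (fmap length (athenaTable ^. tableInput_partitionKeys))
```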

TableVersion

tableVersion_versionId :: Lens' TableVersion (Maybe Text) Source #

The ID value that identifies this table version. A VersionId is a string representation of an integer. Each version is incremented by 1.
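Since a VersionId is the string form of an integer, callers that want to compare or order versions need to parse it. A minimal sketch, assuming newTableVersion takes no arguments; versionNumber is a hypothetical helper, not part of amazonka-glue.

```haskell
import Amazonka.Glue.Lens (tableVersion_versionId)
import Amazonka.Glue.Types (TableVersion, newTableVersion)
import Control.Lens ((&), (?~), (^.))
import qualified Data.Text as Text
import Text.Read (readMaybe)

-- Parse the textual VersionId into a number, if it is present and well formed.
versionNumber :: TableVersion -> Maybe Integer
versionNumber tv = tv ^. tableVersion_versionId >>= readMaybe . Text.unpack

main :: IO ()
main = do
  let v = newTableVersion & tableVersion_versionId ?~ Text.pack "7"
  print (versionNumber v)
  -- A malformed or missing ID yields Nothing rather than an exception.
  print (versionNumber newTableVersion)
```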

TableVersionError

tableVersionError_tableName :: Lens' TableVersionError (Maybe Text) Source #

The name of the table in question.

tableVersionError_versionId :: Lens' TableVersionError (Maybe Text) Source #

The ID value of the version in question. A VersionID is a string representation of an integer. Each version is incremented by 1.

TaskRun

taskRun_completedOn :: Lens' TaskRun (Maybe UTCTime) Source #

The last point in time that the requested task run was completed.

taskRun_errorString :: Lens' TaskRun (Maybe Text) Source #

The list of error strings associated with this task run.

taskRun_executionTime :: Lens' TaskRun (Maybe Int) Source #

The amount of time (in seconds) that the task run consumed resources.

taskRun_lastModifiedOn :: Lens' TaskRun (Maybe UTCTime) Source #

The last point in time that the requested task run was updated.

taskRun_logGroupName :: Lens' TaskRun (Maybe Text) Source #

The name of the log group for secure logging that is associated with this task run.

taskRun_properties :: Lens' TaskRun (Maybe TaskRunProperties) Source #

Specifies configuration properties associated with this task run.

taskRun_startedOn :: Lens' TaskRun (Maybe UTCTime) Source #

The date and time that this task run started.

taskRun_status :: Lens' TaskRun (Maybe TaskStatusType) Source #

The current status of the requested task run.

taskRun_taskRunId :: Lens' TaskRun (Maybe Text) Source #

The unique identifier for this task run.

taskRun_transformId :: Lens' TaskRun (Maybe Text) Source #

The unique identifier for the transform.

TaskRunFilterCriteria

taskRunFilterCriteria_startedAfter :: Lens' TaskRunFilterCriteria (Maybe UTCTime) Source #

Filter on task runs started after this date.

taskRunFilterCriteria_startedBefore :: Lens' TaskRunFilterCriteria (Maybe UTCTime) Source #

Filter on task runs started before this date.

TaskRunProperties

TaskRunSortCriteria

taskRunSortCriteria_column :: Lens' TaskRunSortCriteria TaskRunSortColumnType Source #

The column to be used to sort the list of task runs for the machine learning transform.

taskRunSortCriteria_sortDirection :: Lens' TaskRunSortCriteria SortDirectionType Source #

The sort direction to be used to sort the list of task runs for the machine learning transform.

TransformConfigParameter

transformConfigParameter_isOptional :: Lens' TransformConfigParameter (Maybe Bool) Source #

Specifies whether the parameter is optional or not in the config file of the dynamic transform.

transformConfigParameter_listType :: Lens' TransformConfigParameter (Maybe ParamType) Source #

Specifies the list type of the parameter in the config file of the dynamic transform.

transformConfigParameter_validationMessage :: Lens' TransformConfigParameter (Maybe Text) Source #

Specifies the validation message in the config file of the dynamic transform.

transformConfigParameter_validationRule :: Lens' TransformConfigParameter (Maybe Text) Source #

Specifies the validation rule in the config file of the dynamic transform.

transformConfigParameter_value :: Lens' TransformConfigParameter (Maybe [Text]) Source #

Specifies the value of the parameter in the config file of the dynamic transform.

transformConfigParameter_name :: Lens' TransformConfigParameter Text Source #

Specifies the name of the parameter in the config file of the dynamic transform.

transformConfigParameter_type :: Lens' TransformConfigParameter ParamType Source #

Specifies the parameter type in the config file of the dynamic transform.

TransformEncryption

transformEncryption_mlUserDataEncryption :: Lens' TransformEncryption (Maybe MLUserDataEncryption) Source #

An MLUserDataEncryption object containing the encryption mode and customer-provided KMS key ID.

TransformFilterCriteria

transformFilterCriteria_createdAfter :: Lens' TransformFilterCriteria (Maybe UTCTime) Source #

The time and date after which the transforms were created.

transformFilterCriteria_createdBefore :: Lens' TransformFilterCriteria (Maybe UTCTime) Source #

The time and date before which the transforms were created.

transformFilterCriteria_glueVersion :: Lens' TransformFilterCriteria (Maybe Text) Source #

This value determines which version of Glue this machine learning transform is compatible with. Glue 1.0 is recommended for most customers. If the value is not set, the Glue compatibility defaults to Glue 0.9. For more information, see Glue Versions in the developer guide.

transformFilterCriteria_lastModifiedAfter :: Lens' TransformFilterCriteria (Maybe UTCTime) Source #

Filter on transforms last modified after this date.

transformFilterCriteria_lastModifiedBefore :: Lens' TransformFilterCriteria (Maybe UTCTime) Source #

Filter on transforms last modified before this date.

transformFilterCriteria_name :: Lens' TransformFilterCriteria (Maybe Text) Source #

A unique transform name that is used to filter the machine learning transforms.

transformFilterCriteria_schema :: Lens' TransformFilterCriteria (Maybe [SchemaColumn]) Source #

Filters on datasets with a specific schema. The Map<Column, Type> object is an array of key-value pairs representing the schema this transform accepts, where Column is the name of a column, and Type is the type of the data such as an integer or string. Has an upper bound of 100 columns.

transformFilterCriteria_status :: Lens' TransformFilterCriteria (Maybe TransformStatusType) Source #

Filters the list of machine learning transforms by the last known status of the transforms (to indicate whether a transform can be used or not). One of "NOT_READY", "READY", or "DELETING".

transformFilterCriteria_transformType :: Lens' TransformFilterCriteria (Maybe TransformType) Source #

The type of machine learning transform that is used to filter the machine learning transforms.
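Several of the filter fields above can be combined in one value. The following sketch assumes that newTransformFilterCriteria takes no arguments and that the READY status is exposed as the pattern TransformStatusType_READY, following the library's usual naming; readyTransformsSince is a hypothetical helper.

```haskell
import Amazonka.Glue.Lens
  ( transformFilterCriteria_lastModifiedAfter,
    transformFilterCriteria_status,
  )
import Amazonka.Glue.Types
  ( TransformFilterCriteria,
    TransformStatusType (..),
    newTransformFilterCriteria,
  )
import Control.Lens ((&), (?~), (^.))
import Data.Time (UTCTime, addUTCTime, getCurrentTime, nominalDay)

-- Select transforms that are READY and were modified after the given time.
readyTransformsSince :: UTCTime -> TransformFilterCriteria
readyTransformsSince t =
  newTransformFilterCriteria
    & transformFilterCriteria_status ?~ TransformStatusType_READY
    & transformFilterCriteria_lastModifiedAfter ?~ t

main :: IO ()
main = do
  now <- getCurrentTime
  -- Look back one day from the current time.
  let criteria = readyTransformsSince (addUTCTime (negate nominalDay) now)
  print (criteria ^. transformFilterCriteria_status)
  print (criteria ^. transformFilterCriteria_lastModifiedAfter)
```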

TransformParameters

transformParameters_transformType :: Lens' TransformParameters TransformType Source #

The type of machine learning transform.

For information about the types of machine learning transforms, see Creating Machine Learning Transforms.

TransformSortCriteria

transformSortCriteria_column :: Lens' TransformSortCriteria TransformSortColumnType Source #

The column to be used in the sorting criteria that are associated with the machine learning transform.

transformSortCriteria_sortDirection :: Lens' TransformSortCriteria SortDirectionType Source #

The sort direction to be used in the sorting criteria that are associated with the machine learning transform.

Trigger

trigger_actions :: Lens' Trigger (Maybe [Action]) Source #

The actions initiated by this trigger.

trigger_description :: Lens' Trigger (Maybe Text) Source #

A description of this trigger.

trigger_eventBatchingCondition :: Lens' Trigger (Maybe EventBatchingCondition) Source #

Batch condition that must be met (specified number of events received or batch time window expired) before the EventBridge event trigger fires.

trigger_id :: Lens' Trigger (Maybe Text) Source #

Reserved for future use.

trigger_name :: Lens' Trigger (Maybe Text) Source #

The name of the trigger.

trigger_predicate :: Lens' Trigger (Maybe Predicate) Source #

The predicate of this trigger, which defines when it will fire.

trigger_schedule :: Lens' Trigger (Maybe Text) Source #

A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).

trigger_state :: Lens' Trigger (Maybe TriggerState) Source #

The current state of the trigger.

trigger_type :: Lens' Trigger (Maybe TriggerType) Source #

The type of trigger that this is.

trigger_workflowName :: Lens' Trigger (Maybe Text) Source #

The name of the workflow associated with the trigger.

TriggerNodeDetails

triggerNodeDetails_trigger :: Lens' TriggerNodeDetails (Maybe Trigger) Source #

The information of the trigger represented by the trigger node.

TriggerUpdate

triggerUpdate_actions :: Lens' TriggerUpdate (Maybe [Action]) Source #

The actions initiated by this trigger.

triggerUpdate_description :: Lens' TriggerUpdate (Maybe Text) Source #

A description of this trigger.

triggerUpdate_eventBatchingCondition :: Lens' TriggerUpdate (Maybe EventBatchingCondition) Source #

Batch condition that must be met (specified number of events received or batch time window expired) before the EventBridge event trigger fires.

triggerUpdate_name :: Lens' TriggerUpdate (Maybe Text) Source #

Reserved for future use.

triggerUpdate_predicate :: Lens' TriggerUpdate (Maybe Predicate) Source #

The predicate of this trigger, which defines when it will fire.

triggerUpdate_schedule :: Lens' TriggerUpdate (Maybe Text) Source #

A cron expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers). For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).
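The daily schedule from the example above could be built and attached like this. It is a sketch only: dailyAt is a hypothetical helper that covers just the "every day at a fixed UTC time" case, and newTriggerUpdate is assumed to take no arguments.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Amazonka.Glue.Lens (triggerUpdate_schedule)
import Amazonka.Glue.Types (TriggerUpdate, newTriggerUpdate)
import Control.Lens ((&), (?~), (^.))
import Data.Text (Text)
import qualified Data.Text as Text

-- Build a daily cron expression for the given UTC hour and minute.
-- dailyAt is a hypothetical helper, not part of amazonka-glue.
dailyAt :: Int -> Int -> Text
dailyAt hour minute =
  "cron(" <> Text.pack (show minute) <> " " <> Text.pack (show hour) <> " * * ? *)"

-- A trigger update that only changes the schedule.
dailyUpdate :: TriggerUpdate
dailyUpdate = newTriggerUpdate & triggerUpdate_schedule ?~ dailyAt 12 15

main :: IO ()
main = print (dailyUpdate ^. triggerUpdate_schedule)
```

Note that the minute field comes before the hour field in the expression, so dailyAt 12 15 yields cron(15 12 * * ? *).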

UnfilteredPartition

Union

union_name :: Lens' Union Text Source #

The name of the transform node.

union_inputs :: Lens' Union (NonEmpty Text) Source #

The node ID inputs to the transform.

union_unionType :: Lens' Union UnionType Source #

Indicates the type of Union transform.

Specify ALL to join all rows from data sources to the resulting DynamicFrame. The resulting union does not remove duplicate rows.

Specify DISTINCT to remove duplicate rows in the resulting DynamicFrame.
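The choice between ALL and DISTINCT can be sketched as follows. The sketch assumes that newUnion takes the required name, inputs, and unionType fields in that order, and that the two values are exposed as the patterns UnionType_ALL and UnionType_DISTINCT; the node names are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Amazonka.Glue.Lens (union_unionType)
import Amazonka.Glue.Types (Union, UnionType (..), newUnion)
import Control.Lens ((&), (.~), (^.))
import Data.List.NonEmpty (NonEmpty (..))

-- Assumption: newUnion takes name, inputs, unionType (in that order).
-- ALL keeps every row from both inputs, including duplicates.
keepDuplicates :: Union
keepDuplicates =
  newUnion "union-node" ("left-node" :| ["right-node"]) UnionType_ALL

-- DISTINCT removes duplicate rows from the resulting DynamicFrame.
dropDuplicates :: Union
dropDuplicates = keepDuplicates & union_unionType .~ UnionType_DISTINCT

main :: IO ()
main = do
  print (keepDuplicates ^. union_unionType)
  print (dropDuplicates ^. union_unionType)
```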

UpdateCsvClassifierRequest

updateCsvClassifierRequest_allowSingleColumn :: Lens' UpdateCsvClassifierRequest (Maybe Bool) Source #

Enables the processing of files that contain only one column.

updateCsvClassifierRequest_customDatatypes :: Lens' UpdateCsvClassifierRequest (Maybe [Text]) Source #

Specifies a list of supported custom datatypes.

updateCsvClassifierRequest_delimiter :: Lens' UpdateCsvClassifierRequest (Maybe Text) Source #

A custom symbol to denote what separates each column entry in the row.

updateCsvClassifierRequest_disableValueTrimming :: Lens' UpdateCsvClassifierRequest (Maybe Bool) Source #

Specifies not to trim values before identifying the type of column values. The default value is true.

updateCsvClassifierRequest_header :: Lens' UpdateCsvClassifierRequest (Maybe [Text]) Source #

A list of strings representing column names.

updateCsvClassifierRequest_quoteSymbol :: Lens' UpdateCsvClassifierRequest (Maybe Text) Source #

A custom symbol to denote what combines content into a single column value. It must be different from the column delimiter.

UpdateGrokClassifierRequest

updateGrokClassifierRequest_classification :: Lens' UpdateGrokClassifierRequest (Maybe Text) Source #

An identifier of the data format that the classifier matches, such as Twitter, JSON, Omniture logs, Amazon CloudWatch Logs, and so on.

updateGrokClassifierRequest_customPatterns :: Lens' UpdateGrokClassifierRequest (Maybe Text) Source #

Optional custom grok patterns used by this classifier.

UpdateJsonClassifierRequest

updateJsonClassifierRequest_jsonPath :: Lens' UpdateJsonClassifierRequest (Maybe Text) Source #

A JsonPath string defining the JSON data for the classifier to classify. Glue supports a subset of JsonPath, as described in Writing JsonPath Custom Classifiers.

UpdateXMLClassifierRequest

updateXMLClassifierRequest_classification :: Lens' UpdateXMLClassifierRequest (Maybe Text) Source #

An identifier of the data format that the classifier matches.

updateXMLClassifierRequest_rowTag :: Lens' UpdateXMLClassifierRequest (Maybe Text) Source #

The XML tag designating the element that contains each record in an XML document being parsed. This cannot identify a self-closing element (closed by />). An empty row element that contains only attributes can be parsed as long as it ends with a closing tag (for example, <row item_a="A" item_b="B"></row> is okay, but <row item_a="A" item_b="B" /> is not).

UpsertRedshiftTargetOptions

upsertRedshiftTargetOptions_connectionName :: Lens' UpsertRedshiftTargetOptions (Maybe Text) Source #

The name of the connection to use to write to Redshift.

upsertRedshiftTargetOptions_upsertKeys :: Lens' UpsertRedshiftTargetOptions (Maybe [Text]) Source #

The keys used to determine whether to perform an update or insert.

UserDefinedFunction

userDefinedFunction_catalogId :: Lens' UserDefinedFunction (Maybe Text) Source #

The ID of the Data Catalog in which the function resides.

userDefinedFunction_className :: Lens' UserDefinedFunction (Maybe Text) Source #

The Java class that contains the function code.

userDefinedFunction_createTime :: Lens' UserDefinedFunction (Maybe UTCTime) Source #

The time at which the function was created.

userDefinedFunction_databaseName :: Lens' UserDefinedFunction (Maybe Text) Source #

The name of the catalog database that contains the function.

UserDefinedFunctionInput

userDefinedFunctionInput_className :: Lens' UserDefinedFunctionInput (Maybe Text) Source #

The Java class that contains the function code.

Workflow

workflow_blueprintDetails :: Lens' Workflow (Maybe BlueprintDetails) Source #

This structure indicates the details of the blueprint that this particular workflow is created from.

workflow_createdOn :: Lens' Workflow (Maybe UTCTime) Source #

The date and time when the workflow was created.

workflow_defaultRunProperties :: Lens' Workflow (Maybe (HashMap Text Text)) Source #

A collection of properties to be used as part of each execution of the workflow. The run properties are made available to each job in the workflow. A job can modify the properties for the next jobs in the flow.

workflow_description :: Lens' Workflow (Maybe Text) Source #

A description of the workflow.

workflow_graph :: Lens' Workflow (Maybe WorkflowGraph) Source #

The graph representing all the Glue components that belong to the workflow as nodes and directed connections between them as edges.

workflow_lastModifiedOn :: Lens' Workflow (Maybe UTCTime) Source #

The date and time when the workflow was last modified.

workflow_lastRun :: Lens' Workflow (Maybe WorkflowRun) Source #

The information about the last execution of the workflow.

workflow_maxConcurrentRuns :: Lens' Workflow (Maybe Int) Source #

You can use this parameter to prevent unwanted multiple updates to data, to control costs, or in some cases, to prevent exceeding the maximum number of concurrent runs of any of the component jobs. If you leave this parameter blank, there is no limit to the number of concurrent workflow runs.

workflow_name :: Lens' Workflow (Maybe Text) Source #

The name of the workflow.

WorkflowGraph

workflowGraph_edges :: Lens' WorkflowGraph (Maybe [Edge]) Source #

A list of all the directed connections between the nodes belonging to the workflow.

workflowGraph_nodes :: Lens' WorkflowGraph (Maybe [Node]) Source #

A list of the Glue components that belong to the workflow, represented as nodes.

WorkflowRun

workflowRun_completedOn :: Lens' WorkflowRun (Maybe UTCTime) Source #

The date and time when the workflow run completed.

workflowRun_errorMessage :: Lens' WorkflowRun (Maybe Text) Source #

This error message describes any error that may have occurred in starting the workflow run. Currently the only error message is "Concurrent runs exceeded for workflow: foo."

workflowRun_graph :: Lens' WorkflowRun (Maybe WorkflowGraph) Source #

The graph representing all the Glue components that belong to the workflow as nodes and directed connections between them as edges.

workflowRun_name :: Lens' WorkflowRun (Maybe Text) Source #

Name of the workflow that was run.

workflowRun_previousRunId :: Lens' WorkflowRun (Maybe Text) Source #

The ID of the previous workflow run.

workflowRun_startedOn :: Lens' WorkflowRun (Maybe UTCTime) Source #

The date and time when the workflow run was started.

workflowRun_status :: Lens' WorkflowRun (Maybe WorkflowRunStatus) Source #

The status of the workflow run.

workflowRun_workflowRunId :: Lens' WorkflowRun (Maybe Text) Source #

The ID of this workflow run.

workflowRun_workflowRunProperties :: Lens' WorkflowRun (Maybe (HashMap Text Text)) Source #

The workflow run properties which were set during the run.

WorkflowRunStatistics

workflowRunStatistics_erroredActions :: Lens' WorkflowRunStatistics (Maybe Int) Source #

Indicates the count of job runs in the ERROR state in the workflow run.

workflowRunStatistics_failedActions :: Lens' WorkflowRunStatistics (Maybe Int) Source #

Total number of Actions that have failed.

workflowRunStatistics_stoppedActions :: Lens' WorkflowRunStatistics (Maybe Int) Source #

Total number of Actions that have stopped.

workflowRunStatistics_succeededActions :: Lens' WorkflowRunStatistics (Maybe Int) Source #

Total number of Actions that have succeeded.

workflowRunStatistics_totalActions :: Lens' WorkflowRunStatistics (Maybe Int) Source #

Total number of Actions in the workflow run.

workflowRunStatistics_waitingActions :: Lens' WorkflowRunStatistics (Maybe Int) Source #

Indicates the count of job runs in the WAITING state in the workflow run.
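Because every counter in WorkflowRunStatistics is optional, code that summarises a run has to handle missing values. A minimal sketch, assuming newWorkflowRunStatistics takes no arguments; successRatio is a hypothetical helper, not part of amazonka-glue.

```haskell
import Amazonka.Glue.Lens
  ( workflowRunStatistics_succeededActions,
    workflowRunStatistics_totalActions,
  )
import Amazonka.Glue.Types (WorkflowRunStatistics, newWorkflowRunStatistics)
import Control.Lens ((&), (?~), (^.))

-- Fraction of actions that succeeded, when both counts are present
-- and the total is positive; Nothing otherwise.
successRatio :: WorkflowRunStatistics -> Maybe Double
successRatio stats = do
  total <- stats ^. workflowRunStatistics_totalActions
  ok <- stats ^. workflowRunStatistics_succeededActions
  if total > 0
    then Just (fromIntegral ok / fromIntegral total)
    else Nothing

main :: IO ()
main = do
  let stats =
        newWorkflowRunStatistics
          & workflowRunStatistics_totalActions ?~ 8
          & workflowRunStatistics_succeededActions ?~ 6
  print (successRatio stats) -- Just 0.75
  print (successRatio newWorkflowRunStatistics) -- Nothing
```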

XMLClassifier

xMLClassifier_creationTime :: Lens' XMLClassifier (Maybe UTCTime) Source #

The time that this classifier was registered.

xMLClassifier_lastUpdated :: Lens' XMLClassifier (Maybe UTCTime) Source #

The time that this classifier was last updated.

xMLClassifier_rowTag :: Lens' XMLClassifier (Maybe Text) Source #

The XML tag designating the element that contains each record in an XML document being parsed. This can't identify a self-closing element (closed by />). An empty row element that contains only attributes can be parsed as long as it ends with a closing tag (for example, <row item_a="A" item_b="B"></row> is okay, but <row item_a="A" item_b="B" /> is not).

xMLClassifier_version :: Lens' XMLClassifier (Maybe Integer) Source #

The version of this classifier.

xMLClassifier_name :: Lens' XMLClassifier Text Source #

The name of the classifier.

xMLClassifier_classification :: Lens' XMLClassifier Text Source #

An identifier of the data format that the classifier matches.