amazonka-comprehend-2.0: Amazon Comprehend SDK.
Copyright: (c) 2013-2023 Brendan Hay
License: Mozilla Public License, v. 2.0.
Maintainer: Brendan Hay
Stability: auto-generated
Portability: non-portable (GHC extensions)
Safe Haskell: Safe-Inferred
Language: Haskell2010

Amazonka.Comprehend.Types.InputDataConfig

Documentation

data InputDataConfig Source #

The input properties for an inference job. The document reader config field applies only to non-text inputs for custom analysis.

See: newInputDataConfig smart constructor.

Constructors

InputDataConfig' 

Fields

  • documentReaderConfig :: Maybe DocumentReaderConfig

    Provides configuration parameters to override the default actions for extracting text from PDF documents and image files.

  • inputFormat :: Maybe InputFormat

    Specifies how the text in an input file should be processed:

    • ONE_DOC_PER_FILE - Each file is considered a separate document. Use this option when you are processing large documents, such as newspaper articles or scientific papers.
    • ONE_DOC_PER_LINE - Each line in a file is considered a separate document. Use this option when you are processing many short documents, such as text messages.
  • s3Uri :: Text

    The Amazon S3 URI for the input data. The URI must be in the same Region as the API endpoint that you are calling. The URI can point to a single input file, or it can provide the prefix for a collection of data files.

    For example, if you use the URI S3://bucketName/prefix and the prefix is a single file, Amazon Comprehend uses that file as input. If more than one file begins with the prefix, Amazon Comprehend uses all of them as input.
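
To make the two inputFormat options concrete, here is a minimal sketch in plain Haskell of how the contents of one input file might be divided into documents under each option. The Format type and the splitDocuments function are hypothetical stand-ins introduced for illustration; they are not part of amazonka-comprehend, and the real splitting is performed by the Amazon Comprehend service. How the service treats blank lines is not stated here, so the sketch simply keeps whatever `lines` returns.

```haskell
-- Hypothetical local model of the two input formats.
-- This is NOT the library's InputFormat type.
data Format = OneDocPerFile | OneDocPerLine
  deriving (Eq, Show)

-- Split the contents of a single input file into documents.
splitDocuments :: Format -> String -> [String]
splitDocuments OneDocPerFile contents = [contents]      -- whole file is one document
splitDocuments OneDocPerLine contents = lines contents  -- each line is one document
```

With this model, `splitDocuments OneDocPerLine "hi\nsee you at 5\n"` yields two documents, while `splitDocuments OneDocPerFile` applied to the same string yields a single document containing both lines.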

Instances

FromJSON InputDataConfig Source # 
Instance details

Defined in Amazonka.Comprehend.Types.InputDataConfig

ToJSON InputDataConfig Source # 
Instance details

Defined in Amazonka.Comprehend.Types.InputDataConfig

Generic InputDataConfig Source # 
Instance details

Defined in Amazonka.Comprehend.Types.InputDataConfig

Associated Types

type Rep InputDataConfig :: Type -> Type #

Read InputDataConfig Source # 
Instance details

Defined in Amazonka.Comprehend.Types.InputDataConfig

Show InputDataConfig Source # 
Instance details

Defined in Amazonka.Comprehend.Types.InputDataConfig

NFData InputDataConfig Source # 
Instance details

Defined in Amazonka.Comprehend.Types.InputDataConfig

Methods

rnf :: InputDataConfig -> () #

Eq InputDataConfig Source # 
Instance details

Defined in Amazonka.Comprehend.Types.InputDataConfig

Hashable InputDataConfig Source # 
Instance details

Defined in Amazonka.Comprehend.Types.InputDataConfig

type Rep InputDataConfig Source # 
Instance details

Defined in Amazonka.Comprehend.Types.InputDataConfig

type Rep InputDataConfig = D1 ('MetaData "InputDataConfig" "Amazonka.Comprehend.Types.InputDataConfig" "amazonka-comprehend-2.0-Ko6GCjAQF2RARapSdPn69F" 'False) (C1 ('MetaCons "InputDataConfig'" 'PrefixI 'True) (S1 ('MetaSel ('Just "documentReaderConfig") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe DocumentReaderConfig)) :*: (S1 ('MetaSel ('Just "inputFormat") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe InputFormat)) :*: S1 ('MetaSel ('Just "s3Uri") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 Text))))

newInputDataConfig Source #

Create a value of InputDataConfig with all optional fields omitted.

Use generic-lens or optics to modify other optional fields.

The following record fields are available, with the corresponding lenses provided for backwards compatibility:

$sel:documentReaderConfig:InputDataConfig', inputDataConfig_documentReaderConfig - Provides configuration parameters to override the default actions for extracting text from PDF documents and image files.

$sel:inputFormat:InputDataConfig', inputDataConfig_inputFormat - Specifies how the text in an input file should be processed:

  • ONE_DOC_PER_FILE - Each file is considered a separate document. Use this option when you are processing large documents, such as newspaper articles or scientific papers.
  • ONE_DOC_PER_LINE - Each line in a file is considered a separate document. Use this option when you are processing many short documents, such as text messages.

$sel:s3Uri:InputDataConfig', inputDataConfig_s3Uri - The Amazon S3 URI for the input data. The URI must be in the same Region as the API endpoint that you are calling. The URI can point to a single input file, or it can provide the prefix for a collection of data files.

For example, if you use the URI S3://bucketName/prefix and the prefix is a single file, Amazon Comprehend uses that file as input. If more than one file begins with the prefix, Amazon Comprehend uses all of them as input.
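
As a rough illustration of the smart constructor together with the lenses, the sketch below builds a value from the required S3 URI and then sets one optional field. It rests on several assumptions: that newInputDataConfig takes the s3Uri as its only argument (the signature is not shown on this page, but s3Uri is the only non-Maybe field), that the names used are exported from Amazonka.Comprehend.Types, that the InputFormat pattern is named InputFormat_ONE_DOC_PER_LINE, and that the lens package is available for the operators. The bucket and prefix are made up.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Amazonka.Comprehend.Types
  ( InputDataConfig
  , InputFormat (..)
  , inputDataConfig_inputFormat
  , inputDataConfig_s3Uri
  , newInputDataConfig
  )
import Control.Lens ((&), (?~), (^.))
import Data.Text (Text)

-- Hypothetical bucket and prefix, used only for illustration.
exampleInput :: InputDataConfig
exampleInput =
  newInputDataConfig "s3://example-bucket/messages/"
    -- Optional fields start as Nothing; (?~) sets one to Just a value.
    & inputDataConfig_inputFormat ?~ InputFormat_ONE_DOC_PER_LINE

-- Read the required field back through its lens.
exampleUri :: Text
exampleUri = exampleInput ^. inputDataConfig_s3Uri
```

Because documentReaderConfig is not touched, it stays Nothing in exampleInput, which matches the "all optional fields omitted" behaviour described above.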

inputDataConfig_documentReaderConfig :: Lens' InputDataConfig (Maybe DocumentReaderConfig) Source #

Provides configuration parameters to override the default actions for extracting text from PDF documents and image files.

inputDataConfig_inputFormat :: Lens' InputDataConfig (Maybe InputFormat) Source #

Specifies how the text in an input file should be processed:

  • ONE_DOC_PER_FILE - Each file is considered a separate document. Use this option when you are processing large documents, such as newspaper articles or scientific papers.
  • ONE_DOC_PER_LINE - Each line in a file is considered a separate document. Use this option when you are processing many short documents, such as text messages.

inputDataConfig_s3Uri :: Lens' InputDataConfig Text Source #

The Amazon S3 URI for the input data. The URI must be in the same Region as the API endpoint that you are calling. The URI can point to a single input file, or it can provide the prefix for a collection of data files.

For example, if you use the URI S3://bucketName/prefix and the prefix is a single file, Amazon Comprehend uses that file as input. If more than one file begins with the prefix, Amazon Comprehend uses all of them as input.
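
The prefix rule described above can be sketched in plain Haskell as a filter over object keys. The selectInputs helper and the sample keys are hypothetical and exist only to model the rule; the actual selection is done by Amazon Comprehend against S3, not by client code.

```haskell
import Data.List (isPrefixOf)

-- Hypothetical helper: given the prefix part of the URI and the object keys
-- in the bucket, return the keys that would be used as input.
selectInputs :: String -> [String] -> [String]
selectInputs prefix = filter (prefix `isPrefixOf`)

-- Made-up keys for illustration.
sampleKeys :: [String]
sampleKeys =
  [ "reviews/2023-01.txt"
  , "reviews/2023-02.txt"
  , "reviews/2022-12.txt"
  ]
```

Here `selectInputs "reviews/2023" sampleKeys` selects the two 2023 files, while `selectInputs "reviews/2022-12.txt" sampleKeys` selects exactly one file, mirroring the single-file case.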