amazonka-comprehend-2.0: Amazon Comprehend SDK.
Copyright: (c) 2013-2023 Brendan Hay
License: Mozilla Public License, v. 2.0.
Maintainer: Brendan Hay
Stability: auto-generated
Portability: non-portable (GHC extensions)
Safe Haskell: Safe-Inferred
Language: Haskell2010

Amazonka.Comprehend.ClassifyDocument

Description

Creates a new document classification request to analyze a single document in real-time, using a previously created and trained custom model and an endpoint.

You can input plain text or you can upload a single-page input document (text, PDF, Word, or image).

If the system detects errors while processing a page in the input document, the API response includes an entry in Errors that describes the errors.

If the system detects a document-level error in your input document, the API returns an InvalidRequestException error response. For details about this exception, see Errors in semi-structured documents in the Comprehend Developer Guide.
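
A minimal sketch of issuing this request with amazonka 2.0. It assumes the lens package supplies the (&) and (?~) operators and that newClassifyDocument takes the endpoint ARN as its only argument (the signature is not shown on this page). The helper names textRequest and classifyText are illustrative and not part of the library.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Amazonka
import Amazonka.Comprehend.ClassifyDocument
import Control.Lens ((&), (?~))
import Data.Text (Text)

-- | Build a request that classifies plain text (hypothetical helper).
-- Assumption: newClassifyDocument takes the endpoint ARN as its only argument.
textRequest :: Text -> Text -> ClassifyDocument
textRequest endpoint body =
  newClassifyDocument endpoint & classifyDocument_text ?~ body

-- | Send the request using an already configured environment.
classifyText :: Amazonka.Env -> Text -> Text -> IO ClassifyDocumentResponse
classifyText env endpoint body =
  Amazonka.runResourceT (Amazonka.send env (textRequest endpoint body))
```

An environment can be obtained with Amazonka.newEnv Amazonka.discover. As described above, page-level problems are reported in the response's errors field, while document-level problems are reported as an InvalidRequestException.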

Creating a Request

data ClassifyDocument Source #

See: newClassifyDocument smart constructor.

Constructors

ClassifyDocument' 

Fields

  • bytes :: Maybe Base64

    Use the Bytes parameter to input a text, PDF, Word or image file. You can also use the Bytes parameter to input an Amazon Textract DetectDocumentText or AnalyzeDocument output file.

    Provide the input document as a sequence of base64-encoded bytes. If your code uses an Amazon Web Services SDK to classify documents, the SDK may encode the document file bytes for you.

    The maximum length of this field depends on the input document type. For details, see Inputs for real-time custom analysis in the Comprehend Developer Guide.

    If you use the Bytes parameter, do not use the Text parameter.

  • documentReaderConfig :: Maybe DocumentReaderConfig

    Provides configuration parameters to override the default actions for extracting text from PDF documents and image files.

  • text :: Maybe (Sensitive Text)

    The document text to be analyzed. If you enter text using this parameter, do not use the Bytes parameter.

  • endpointArn :: Text

    The Amazon Resource Name (ARN) of the endpoint. For information about endpoints, see Managing endpoints.
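
Because Bytes and Text must not be combined, one option is to model the input as a sum type so that a request can only carry one of them. This is a sketch, not something the library provides: DocumentInput, mkRequest, and setsBothInputs are hypothetical names, and the lens package is assumed for the operators.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Amazonka.Comprehend.ClassifyDocument
import Control.Lens ((&), (?~), (^.))
import Data.ByteString (ByteString)
import Data.Maybe (isJust)
import Data.Text (Text)

-- | Hypothetical helper type: exactly one kind of input per request.
data DocumentInput
  = PlainText Text         -- sent through the Text parameter
  | RawDocument ByteString -- sent through the Bytes parameter (raw, unencoded)

-- | Build a request that sets either Text or Bytes, never both.
mkRequest :: Text -> DocumentInput -> ClassifyDocument
mkRequest endpoint input =
  case input of
    PlainText t -> base & classifyDocument_text ?~ t
    RawDocument b -> base & classifyDocument_bytes ?~ b
  where
    -- Assumption: the smart constructor takes only the endpoint ARN.
    base = newClassifyDocument endpoint

-- | True when a request sets both inputs, which the field docs advise against.
setsBothInputs :: ClassifyDocument -> Bool
setsBothInputs r =
  isJust (r ^. classifyDocument_bytes) && isJust (r ^. classifyDocument_text)
```

Requests produced by mkRequest never satisfy setsBothInputs; the check is useful mainly for requests assembled by hand.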

Instances

ToJSON ClassifyDocument Source # 
Instance details

Defined in Amazonka.Comprehend.ClassifyDocument

ToHeaders ClassifyDocument Source # 
Instance details

Defined in Amazonka.Comprehend.ClassifyDocument

ToPath ClassifyDocument Source # 
Instance details

Defined in Amazonka.Comprehend.ClassifyDocument

ToQuery ClassifyDocument Source # 
Instance details

Defined in Amazonka.Comprehend.ClassifyDocument

AWSRequest ClassifyDocument Source # 
Instance details

Defined in Amazonka.Comprehend.ClassifyDocument

Associated Types

type AWSResponse ClassifyDocument #

Generic ClassifyDocument Source # 
Instance details

Defined in Amazonka.Comprehend.ClassifyDocument

Associated Types

type Rep ClassifyDocument :: Type -> Type #

Show ClassifyDocument Source # 
Instance details

Defined in Amazonka.Comprehend.ClassifyDocument

NFData ClassifyDocument Source # 
Instance details

Defined in Amazonka.Comprehend.ClassifyDocument

Methods

rnf :: ClassifyDocument -> () #

Eq ClassifyDocument Source # 
Instance details

Defined in Amazonka.Comprehend.ClassifyDocument

Hashable ClassifyDocument Source # 
Instance details

Defined in Amazonka.Comprehend.ClassifyDocument

type AWSResponse ClassifyDocument Source # 
Instance details

Defined in Amazonka.Comprehend.ClassifyDocument

type Rep ClassifyDocument Source # 
Instance details

Defined in Amazonka.Comprehend.ClassifyDocument

type Rep ClassifyDocument = D1 ('MetaData "ClassifyDocument" "Amazonka.Comprehend.ClassifyDocument" "amazonka-comprehend-2.0-Ko6GCjAQF2RARapSdPn69F" 'False) (C1 ('MetaCons "ClassifyDocument'" 'PrefixI 'True) ((S1 ('MetaSel ('Just "bytes") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe Base64)) :*: S1 ('MetaSel ('Just "documentReaderConfig") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe DocumentReaderConfig))) :*: (S1 ('MetaSel ('Just "text") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe (Sensitive Text))) :*: S1 ('MetaSel ('Just "endpointArn") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 Text))))

newClassifyDocument Source #

Create a value of ClassifyDocument with all optional fields omitted.

Use generic-lens or optics to modify other optional fields.

The following record fields are available, with the corresponding lenses provided for backwards compatibility:

$sel:bytes:ClassifyDocument', classifyDocument_bytes - Use the Bytes parameter to input a text, PDF, Word or image file. You can also use the Bytes parameter to input an Amazon Textract DetectDocumentText or AnalyzeDocument output file.

Provide the input document as a sequence of base64-encoded bytes. If your code uses an Amazon Web Services SDK to classify documents, the SDK may encode the document file bytes for you.

The maximum length of this field depends on the input document type. For details, see Inputs for real-time custom analysis in the Comprehend Developer Guide.

If you use the Bytes parameter, do not use the Text parameter.

Note: This Lens automatically encodes and decodes Base64 data. The underlying isomorphism will encode to Base64 representation during serialisation, and decode from Base64 representation during deserialisation. This Lens accepts and returns only raw unencoded data.

ClassifyDocument, classifyDocument_documentReaderConfig - Provides configuration parameters to override the default actions for extracting text from PDF documents and image files.

ClassifyDocument, classifyDocument_text - The document text to be analyzed. If you enter text using this parameter, do not use the Bytes parameter.

ClassifyDocument, classifyDocument_endpointArn - The Amazon Resource Name (ARN) of the endpoint. For information about endpoints, see Managing endpoints.

Request Lenses

classifyDocument_bytes :: Lens' ClassifyDocument (Maybe ByteString) Source #

Use the Bytes parameter to input a text, PDF, Word or image file. You can also use the Bytes parameter to input an Amazon Textract DetectDocumentText or AnalyzeDocument output file.

Provide the input document as a sequence of base64-encoded bytes. If your code uses an Amazon Web Services SDK to classify documents, the SDK may encode the document file bytes for you.

The maximum length of this field depends on the input document type. For details, see Inputs for real-time custom analysis in the Comprehend Developer Guide.

If you use the Bytes parameter, do not use the Text parameter.

Note: This Lens automatically encodes and decodes Base64 data. The underlying isomorphism will encode to Base64 representation during serialisation, and decode from Base64 representation during deserialisation. This Lens accepts and returns only raw unencoded data.
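
Since this lens works on raw bytes, a file can be attached without a manual Base64 step. The following sketch assumes the lens package for the operators and a smart constructor that takes only the endpoint ARN; requestFromFile is a hypothetical helper name.

```haskell
import Amazonka.Comprehend.ClassifyDocument
import Control.Lens ((&), (?~))
import qualified Data.ByteString as BS
import qualified Data.Text as T

-- | Read a local document and attach its raw, unencoded bytes.
-- Encoding to Base64 is left to the lens, as the note above describes.
requestFromFile :: T.Text -> FilePath -> IO ClassifyDocument
requestFromFile endpoint path = do
  raw <- BS.readFile path
  pure (newClassifyDocument endpoint & classifyDocument_bytes ?~ raw)
```

The size limits mentioned above still apply; this helper does not check the length of the file.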

classifyDocument_documentReaderConfig :: Lens' ClassifyDocument (Maybe DocumentReaderConfig) Source #

Provides configuration parameters to override the default actions for extracting text from PDF documents and image files.

classifyDocument_text :: Lens' ClassifyDocument (Maybe Text) Source #

The document text to be analyzed. If you enter text using this parameter, do not use the Bytes parameter.

classifyDocument_endpointArn :: Lens' ClassifyDocument Text Source #

The Amazon Resource Name (ARN) of the endpoint. For information about endpoints, see Managing endpoints.

Destructuring the Response

data ClassifyDocumentResponse Source #

See: newClassifyDocumentResponse smart constructor.

Constructors

ClassifyDocumentResponse' 

Fields

  • classes :: Maybe [DocumentClass]

    The classes used by the document being analyzed. These are used for multi-class trained models. Individual classes are mutually exclusive and each document is expected to have only a single class assigned to it. For example, an animal can be a dog or a cat, but not both at the same time.

  • documentMetadata :: Maybe DocumentMetadata

    Extraction information about the document. This field is present in the response only if your request includes the Bytes parameter.

  • documentType :: Maybe [DocumentTypeListItem]

    The document type for each page in the input document. This field is present in the response only if your request includes the Bytes parameter.

  • errors :: Maybe [ErrorsListItem]

    Page-level errors that the system detected while processing the input document. The field is empty if the system encountered no errors.

  • labels :: Maybe [DocumentLabel]

    The labels used by the document being analyzed. These are used for multi-label trained models. Individual labels represent different categories that are related in some manner and are not mutually exclusive. For example, a movie can be just an action movie, or it can be an action movie, a science fiction movie, and a comedy, all at the same time.

  • httpStatus :: Int

    The response's HTTP status code.

Instances

Generic ClassifyDocumentResponse Source # 
Instance details

Defined in Amazonka.Comprehend.ClassifyDocument

Associated Types

type Rep ClassifyDocumentResponse :: Type -> Type #

Show ClassifyDocumentResponse Source # 
Instance details

Defined in Amazonka.Comprehend.ClassifyDocument

NFData ClassifyDocumentResponse Source # 
Instance details

Defined in Amazonka.Comprehend.ClassifyDocument

Eq ClassifyDocumentResponse Source # 
Instance details

Defined in Amazonka.Comprehend.ClassifyDocument

type Rep ClassifyDocumentResponse Source # 
Instance details

Defined in Amazonka.Comprehend.ClassifyDocument

type Rep ClassifyDocumentResponse = D1 ('MetaData "ClassifyDocumentResponse" "Amazonka.Comprehend.ClassifyDocument" "amazonka-comprehend-2.0-Ko6GCjAQF2RARapSdPn69F" 'False) (C1 ('MetaCons "ClassifyDocumentResponse'" 'PrefixI 'True) ((S1 ('MetaSel ('Just "classes") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe [DocumentClass])) :*: (S1 ('MetaSel ('Just "documentMetadata") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe DocumentMetadata)) :*: S1 ('MetaSel ('Just "documentType") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe [DocumentTypeListItem])))) :*: (S1 ('MetaSel ('Just "errors") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe [ErrorsListItem])) :*: (S1 ('MetaSel ('Just "labels") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Maybe [DocumentLabel])) :*: S1 ('MetaSel ('Just "httpStatus") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 Int)))))

newClassifyDocumentResponse Source #

Create a value of ClassifyDocumentResponse with all optional fields omitted.

Use generic-lens or optics to modify other optional fields.

The following record fields are available, with the corresponding lenses provided for backwards compatibility:

$sel:classes:ClassifyDocumentResponse', classifyDocumentResponse_classes - The classes used by the document being analyzed. These are used for multi-class trained models. Individual classes are mutually exclusive and each document is expected to have only a single class assigned to it. For example, an animal can be a dog or a cat, but not both at the same time.

$sel:documentMetadata:ClassifyDocumentResponse', classifyDocumentResponse_documentMetadata - Extraction information about the document. This field is present in the response only if your request includes the Bytes parameter.

ClassifyDocumentResponse, classifyDocumentResponse_documentType - The document type for each page in the input document. This field is present in the response only if your request includes the Bytes parameter.

$sel:errors:ClassifyDocumentResponse', classifyDocumentResponse_errors - Page-level errors that the system detected while processing the input document. The field is empty if the system encountered no errors.

$sel:labels:ClassifyDocumentResponse', classifyDocumentResponse_labels - The labels used by the document being analyzed. These are used for multi-label trained models. Individual labels represent different categories that are related in some manner and are not mutually exclusive. For example, a movie can be just an action movie, or it can be an action movie, a science fiction movie, and a comedy, all at the same time.

$sel:httpStatus:ClassifyDocumentResponse', classifyDocumentResponse_httpStatus - The response's HTTP status code.

Response Lenses

classifyDocumentResponse_classes :: Lens' ClassifyDocumentResponse (Maybe [DocumentClass]) Source #

The classes used by the document being analyzed. These are used for multi-class trained models. Individual classes are mutually exclusive and each document is expected to have only a single class assigned to it. For example, an animal can be a dog or a cat, but not both at the same time.
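
For a multi-class model the caller usually wants the best-scoring class. The sketch below orders the returned classes by score. It assumes that Amazonka.Comprehend.Types exports DocumentClass, newDocumentClass, and a documentClass_score lens onto a Maybe Double (none of these appear on this page), that newClassifyDocumentResponse takes the HTTP status as its only argument, and that the lens package provides the operators. rankedClasses and sampleResponse are hypothetical names, and the scores are made up.

```haskell
import Amazonka.Comprehend.ClassifyDocument
import Amazonka.Comprehend.Types (DocumentClass, documentClass_score, newDocumentClass)
import Control.Lens ((&), (?~), (^.))
import Data.List (sortOn)
import Data.Maybe (fromMaybe)
import Data.Ord (Down (..))

-- | Classes from a response, highest score first.
-- A missing list is treated as empty; classes without a score sort last.
rankedClasses :: ClassifyDocumentResponse -> [DocumentClass]
rankedClasses resp =
  sortOn (Down . (^. documentClass_score)) $
    fromMaybe [] (resp ^. classifyDocumentResponse_classes)

-- | Hand-built response for experimenting offline (made-up scores).
sampleResponse :: ClassifyDocumentResponse
sampleResponse =
  newClassifyDocumentResponse 200
    & classifyDocumentResponse_classes
      ?~ [ newDocumentClass & documentClass_score ?~ 0.2
         , newDocumentClass & documentClass_score ?~ 0.7
         ]
```

Taking the head of rankedClasses (after checking for an empty list) gives the single class a multi-class model is expected to assign.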

classifyDocumentResponse_documentMetadata :: Lens' ClassifyDocumentResponse (Maybe DocumentMetadata) Source #

Extraction information about the document. This field is present in the response only if your request includes the Bytes parameter.

classifyDocumentResponse_documentType :: Lens' ClassifyDocumentResponse (Maybe [DocumentTypeListItem]) Source #

The document type for each page in the input document. This field is present in the response only if your request includes the Bytes parameter.

classifyDocumentResponse_errors :: Lens' ClassifyDocumentResponse (Maybe [ErrorsListItem]) Source #

Page-level errors that the system detected while processing the input document. The field is empty if the system encountered no errors.
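
A successful HTTP response can still carry page-level errors, so it may be worth checking this field before trusting the classification. A small sketch, assuming ErrorsListItem is exported from Amazonka.Comprehend.Types and the lens package supplies (^.); checkErrors is a hypothetical helper name.

```haskell
import Amazonka.Comprehend.ClassifyDocument
import Amazonka.Comprehend.Types (ErrorsListItem)
import Control.Lens ((^.))
import Data.Maybe (fromMaybe)

-- | Separate responses with page-level errors from clean ones.
-- Both a missing field and an empty list count as "no errors".
checkErrors :: ClassifyDocumentResponse -> Either [ErrorsListItem] ClassifyDocumentResponse
checkErrors resp =
  case fromMaybe [] (resp ^. classifyDocumentResponse_errors) of
    [] -> Right resp
    errs -> Left errs
```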

classifyDocumentResponse_labels :: Lens' ClassifyDocumentResponse (Maybe [DocumentLabel]) Source #

The labels used by the document being analyzed. These are used for multi-label trained models. Individual labels represent different categories that are related in some manner and are not mutually exclusive. For example, a movie can be just an action movie, or it can be an action movie, a science fiction movie, and a comedy, all at the same time.