Copyright | (c) 2013-2015 Brendan Hay |
---|---|
License | Mozilla Public License, v. 2.0. |
Maintainer | Brendan Hay <brendan.g.hay@gmail.com> |
Stability | auto-generated |
Portability | non-portable (GHC extensions) |
Safe Haskell | None |
Language | Haskell2010 |
Creates a DataSource from Amazon Redshift. A DataSource references data
that can be used to perform either CreateMLModel, CreateEvaluation or
CreateBatchPrediction operations.
CreateDataSourceFromRedshift is an asynchronous operation. In response
to CreateDataSourceFromRedshift, Amazon Machine Learning (Amazon ML)
immediately returns and sets the DataSource status to PENDING. After
the DataSource is created and ready for use, Amazon ML sets the Status
parameter to COMPLETED. A DataSource in COMPLETED or PENDING status can
only be used to perform CreateMLModel, CreateEvaluation, or
CreateBatchPrediction operations.
If Amazon ML cannot accept the input source, it sets the Status
parameter to FAILED and includes an error message in the Message
attribute of the GetDataSource operation response.
The observations should exist in the database hosted on an Amazon
Redshift cluster and should be specified by a SelectSqlQuery. Amazon ML
executes an Unload command in Amazon Redshift to transfer the result set
of SelectSqlQuery to S3StagingLocation.
After the DataSource is created, it's ready for use in evaluations and
batch predictions. If you plan to use the DataSource to train an
MLModel, the DataSource requires another item: a recipe. A recipe
describes the observation variables that participate in training an
MLModel and how each input variable will be used in training. Will the
variable be included or excluded from training? Will the variable be
manipulated, for example, combined with another variable or split apart
into word combinations? The recipe provides answers to these questions.
For more information, see the Amazon Machine Learning Developer Guide.
See: AWS API Reference for CreateDataSourceFromRedshift.
- createDataSourceFromRedshift :: Text -> RedshiftDataSpec -> Text -> CreateDataSourceFromRedshift
- data CreateDataSourceFromRedshift
- cdsfrDataSourceName :: Lens' CreateDataSourceFromRedshift (Maybe Text)
- cdsfrComputeStatistics :: Lens' CreateDataSourceFromRedshift (Maybe Bool)
- cdsfrDataSourceId :: Lens' CreateDataSourceFromRedshift Text
- cdsfrDataSpec :: Lens' CreateDataSourceFromRedshift RedshiftDataSpec
- cdsfrRoleARN :: Lens' CreateDataSourceFromRedshift Text
- createDataSourceFromRedshiftResponse :: Int -> CreateDataSourceFromRedshiftResponse
- data CreateDataSourceFromRedshiftResponse
- cdsfrrsDataSourceId :: Lens' CreateDataSourceFromRedshiftResponse (Maybe Text)
- cdsfrrsResponseStatus :: Lens' CreateDataSourceFromRedshiftResponse Int
Creating a Request
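createDataSourceFromRedshift :: Text -> RedshiftDataSpec -> Text -> CreateDataSourceFromRedshift
Creates a value of CreateDataSourceFromRedshift with the minimum fields
required to make a request.
Use one of the following lenses to modify other fields as desired:
- cdsfrDataSourceName
- cdsfrComputeStatistics
- cdsfrDataSourceId
- cdsfrDataSpec
- cdsfrRoleARN
data CreateDataSourceFromRedshift
See: createDataSourceFromRedshift smart constructor.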
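
The smart constructor and optional lenses described above can be combined roughly as follows. This is a minimal sketch, assuming the `amazonka-ml` and `lens` packages are available and that `Network.AWS.MachineLearning` re-exports the names listed in the synopsis. `mkRequest` is a hypothetical helper, and the ID, name, and role ARN are placeholder values, not real resources. The `RedshiftDataSpec` is taken as an argument so that the sketch does not depend on how it is built.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Lens ((&), (?~))
import Network.AWS.MachineLearning

-- | Hypothetical helper: build a request from a caller-supplied data spec.
-- The three positional arguments are the required fields, in the order
-- given by the synopsis: DataSourceId, DataSpec, RoleARN.
mkRequest :: RedshiftDataSpec -> CreateDataSourceFromRedshift
mkRequest spec =
  createDataSourceFromRedshift
      "example-datasource-id"                       -- placeholder DataSourceId
      spec
      "arn:aws:iam::123456789012:role/ExampleRole"  -- placeholder RoleARN
    -- Optional fields are 'Maybe' lenses, so '?~' wraps the value in 'Just'.
    & cdsfrDataSourceName    ?~ "Example Redshift source"
    -- Needed if the DataSource will later be used for MLModel training.
    & cdsfrComputeStatistics ?~ True
```

Keeping the required fields positional and the optional ones as lens updates mirrors the split between the smart constructor and the request lenses documented below.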
Request Lenses
cdsfrDataSourceName :: Lens' CreateDataSourceFromRedshift (Maybe Text)
A user-supplied name or description of the DataSource.
cdsfrComputeStatistics :: Lens' CreateDataSourceFromRedshift (Maybe Bool)
The compute statistics for a DataSource. The statistics are generated
from the observation data referenced by a DataSource. Amazon ML uses
the statistics internally during MLModel training. This parameter must
be set to true if the DataSource needs to be used for MLModel training.
cdsfrDataSourceId :: Lens' CreateDataSourceFromRedshift Text
A user-supplied ID that uniquely identifies the DataSource.
cdsfrDataSpec :: Lens' CreateDataSourceFromRedshift RedshiftDataSpec
The data specification of an Amazon Redshift DataSource:
- DatabaseInformation
  - DatabaseName - Name of the Amazon Redshift database.
  - ClusterIdentifier - Unique ID for the Amazon Redshift cluster.
- DatabaseCredentials - AWS Identity and Access Management (IAM) credentials that are used to connect to the Amazon Redshift database.
- SelectSqlQuery - Query that is used to retrieve the observation data for the DataSource.
- S3StagingLocation - Amazon Simple Storage Service (Amazon S3) location for staging Amazon Redshift data. The data retrieved from Amazon Redshift using SelectSqlQuery is stored in this location.
- DataSchemaUri - Amazon S3 location of the DataSchema.
- DataSchema - A JSON string representing the schema. This is not required if DataSchemaUri is specified.
- DataRearrangement - A JSON string representing the splitting requirement of a DataSource.
  Sample - "{\"splitting\":{\"percentBegin\":10,\"percentEnd\":60}}"
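
The DataRearrangement sample above is a plain JSON string, so it can be assembled by hand. The sketch below uses only `base`. `splittingJson` is a hypothetical helper, not part of the library, and the range check (0 to 100, with begin strictly below end) is an assumption about what counts as a sensible percentage split rather than something this page states.

```haskell
-- | Hypothetical helper: render a splitting requirement as the JSON string
-- shown in the DataRearrangement sample. Returns 'Nothing' when the bounds
-- do not describe a non-empty percentage range (an assumed validity rule).
splittingJson :: Int -> Int -> Maybe String
splittingJson begin end
  | 0 <= begin && begin < end && end <= 100 = Just json
  | otherwise                               = Nothing
  where
    json =
      "{\"splitting\":{\"percentBegin\":" ++ show begin
        ++ ",\"percentEnd\":" ++ show end ++ "}}"
```

For example, `splittingJson 10 60` yields the sample string from the list above. The result is a `String`, so it would need converting (for instance with `Data.Text.pack`) before being stored in a `Text` field.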
cdsfrRoleARN :: Lens' CreateDataSourceFromRedshift Text
A fully specified role Amazon Resource Name (ARN). Amazon ML assumes the role on behalf of the user to create the following:
- A security group to allow Amazon ML to execute the SelectSqlQuery query on an Amazon Redshift cluster
- An Amazon S3 bucket policy to grant Amazon ML read/write permissions on the S3StagingLocation
Destructuring the Response
createDataSourceFromRedshiftResponse :: Int -> CreateDataSourceFromRedshiftResponse
Creates a value of CreateDataSourceFromRedshiftResponse with the minimum
fields required to make a request.
Use one of the following lenses to modify other fields as desired:
- cdsfrrsDataSourceId
- cdsfrrsResponseStatus
data CreateDataSourceFromRedshiftResponse
Represents the output of a CreateDataSourceFromRedshift operation, and is an acknowledgement that Amazon ML received the request.
The CreateDataSourceFromRedshift operation is asynchronous. You can poll
for updates by using the GetDataSource operation and checking the Status
parameter.
See: createDataSourceFromRedshiftResponse smart constructor.
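
Because the operation is asynchronous, callers typically poll until the Status leaves PENDING. The following is a minimal, library-independent sketch of such a loop using only `base`. The `Status` type is a local stand-in covering only the three values named on this page (PENDING, COMPLETED, FAILED), and the `fetch` argument is a hypothetical action that would wrap a GetDataSource call; neither is part of the generated API.

```haskell
import Control.Concurrent (threadDelay)

-- | Local stand-in for the Status values named in this document.
-- 'Failed' carries the error text reported in the Message attribute.
data Status = Pending | Completed | Failed String
  deriving (Eq, Show)

-- | Poll a status-fetching action until it reports something other than
-- 'Pending'. Gives up with 'Nothing' once the attempts are exhausted.
pollUntilDone
  :: Int        -- ^ maximum number of attempts
  -> Int        -- ^ delay between attempts, in microseconds
  -> IO Status  -- ^ hypothetical action wrapping a GetDataSource call
  -> IO (Maybe Status)
pollUntilDone attempts delayMicros fetch
  | attempts <= 0 = pure Nothing
  | otherwise = do
      st <- fetch
      case st of
        Pending -> do
          threadDelay delayMicros
          pollUntilDone (attempts - 1) delayMicros fetch
        done -> pure (Just done)
```

Bounding the number of attempts keeps a stuck DataSource from blocking the caller indefinitely; a production loop would probably also back off between attempts.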
Response Lenses
cdsfrrsDataSourceId :: Lens' CreateDataSourceFromRedshiftResponse (Maybe Text)
A user-supplied ID that uniquely identifies the DataSource. This value
should be identical to the value of the DataSourceId in the request.
cdsfrrsResponseStatus :: Lens' CreateDataSourceFromRedshiftResponse Int
The response status code.
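
The response lenses can be read with the usual `lens` operators. The sketch below assumes the `amazonka-ml` and `lens` packages and that `Network.AWS.MachineLearning` re-exports the names from the synopsis. `acknowledgedId` and `exampleResponse` are hypothetical names introduced for illustration, and treating any 2xx status as success is an assumption of this sketch.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Lens ((&), (?~), (^.))
import Data.Text (Text)
import Network.AWS.MachineLearning

-- | Hypothetical helper: return the echoed DataSourceId only when the
-- response status code is in the 2xx range.
acknowledgedId :: CreateDataSourceFromRedshiftResponse -> Maybe Text
acknowledgedId rs
  | status >= 200 && status < 300 = rs ^. cdsfrrsDataSourceId
  | otherwise                     = Nothing
  where
    status = rs ^. cdsfrrsResponseStatus

-- | An illustrative response value built with the smart constructor;
-- real responses come back from sending the request.
exampleResponse :: CreateDataSourceFromRedshiftResponse
exampleResponse =
  createDataSourceFromRedshiftResponse 200
    & cdsfrrsDataSourceId ?~ "example-datasource-id"
```

A returned ID only acknowledges that the request was received; the DataSource itself still starts out in PENDING status.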