Running the script as described in step 6 fails with the following error:
{
  "error": {
    "code": 400,
    "message": "Invalid JSON payload received. Unknown name \"entity_inference_enabled\" at 'data_scan.data_discovery_spec.storage_config.unstructured_data_options': Cannot find field.",
    "status": "INVALID_ARGUMENT",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.BadRequest",
        "fieldViolations": [
          {
            "field": "data_scan.data_discovery_spec.storage_config.unstructured_data_options",
            "description": "Invalid JSON payload received. Unknown name \"entity_inference_enabled\" at 'data_scan.data_discovery_spec.storage_config.unstructured_data_options': Cannot find field."
          }
        ]
      }
    ]
  }
}
There are two problems:
- The API expects camelCase field names, so it should be "onDemand" instead of "on_demand".
- "entity_inference_enabled" is not camelCase either, but replacing it with "entityInferenceEnabled" will not work, because the field has been renamed to "semanticInferenceEnabled".
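To illustrate the first problem, here is a minimal sketch of a snake_case to camelCase conversion for a single field name. The function name snake_to_camel is hypothetical and not part of any Google tooling. It only changes the naming convention, so it cannot account for renamed fields: it turns "entity_inference_enabled" into "entityInferenceEnabled", which the API still rejects.

```shell
# Hypothetical helper: convert one snake_case field name to camelCase.
# It splits the name on underscores and capitalizes the first letter of
# every part except the first one.
snake_to_camel() {
  printf '%s\n' "$1" | awk -F'_' '{
    out = $1
    for (i = 2; i <= NF; i++) {
      out = out toupper(substr($i, 1, 1)) substr($i, 2)
    }
    print out
  }'
}

snake_to_camel "on_demand"
snake_to_camel "entity_inference_enabled"
```

The second call shows why a mechanical conversion is not enough here: the result is valid camelCase, but the key the API accepts is "semanticInferenceEnabled".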
So the script in step 6 should be:
# 1. Set your variables
PROJECT_ID="<PROJECT_ID>"
REGION="<REGION>"
ENV_SUFFIX="stg1"
DATASCAN_ID="froyo-data-${ENV_SUFFIX}"
BUCKET_NAME="<BUCKET_NAME>"
# 2. Set this to the name of the connection you created in Step 7
CONNECTION_ID="<CONNECTION_ID_NAME>"
# 3. Define the API Endpoint
DATAPLEX_API="dataplex.googleapis.com/v1/projects/${PROJECT_ID}/locations/${REGION}"
# 4. Create the DataScan via CURL
echo "Creating Dataplex DataScan: ${DATASCAN_ID}..."
curl -X POST "https://$DATAPLEX_API/dataScans?dataScanId=${DATASCAN_ID}" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  -d '{
    "data": {
      "resource": "//storage.googleapis.com/projects/'"${PROJECT_ID}"'/buckets/'"${BUCKET_NAME}"'"
    },
    "executionSpec": {
      "trigger": {
        "onDemand": {}
      }
    },
    "dataDiscoverySpec": {
      "bigqueryPublishingConfig": {
        "tableType": "BIGLAKE",
        "connection": "projects/'"${PROJECT_ID}"'/locations/'"${REGION}"'/connections/'"${CONNECTION_ID}"'"
      },
      "storageConfig": {
        "unstructuredDataOptions": {
          "semanticInferenceEnabled": true
        }
      }
    }
  }'
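As an optional variation, the request body can be built in a small function with a here-document, which avoids the '"${VAR}"' quote splicing and lets you inspect the payload before sending it. This is only a sketch: the function name build_datascan_payload and its argument order are my own choices, and the JSON it prints is the same body as in the script.

```shell
# Hypothetical helper: print the DataScan request body.
# Arguments: $1 = project ID, $2 = region, $3 = bucket name,
#            $4 = BigQuery connection ID.
build_datascan_payload() {
  # The here-document delimiter is unquoted, so ${1}..${4} are expanded.
  cat <<EOF
{
  "data": {
    "resource": "//storage.googleapis.com/projects/${1}/buckets/${3}"
  },
  "executionSpec": {
    "trigger": {
      "onDemand": {}
    }
  },
  "dataDiscoverySpec": {
    "bigqueryPublishingConfig": {
      "tableType": "BIGLAKE",
      "connection": "projects/${1}/locations/${2}/connections/${4}"
    },
    "storageConfig": {
      "unstructuredDataOptions": {
        "semanticInferenceEnabled": true
      }
    }
  }
}
EOF
}
```

To use it, print the payload once to check it, then pass it to the same curl call with -d "$(build_datascan_payload "$PROJECT_ID" "$REGION" "$BUCKET_NAME" "$CONNECTION_ID")" in place of the inline JSON.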