Service Configuration for Google Cloud
Environment variables
Define the following environment variables.
Must have:
name | value | description | sensitive? | source |
---|---|---|---|---|
SPRING_PROFILES_ACTIVE | ex gcp | Spring profile that activates the default configuration for the Google Cloud environment | no | - |
<ELASTICSEARCH_USER_ENV_VARIABLE_NAME> | ex user | Elasticsearch user. The name of this variable is not defined at the service level; it is received from the Partition service. Each tenant can have its own ENV variable name, and that variable must be present in the ENV of the Indexer service; see "Properties set in Partition service" | yes | - |
<ELASTICSEARCH_PASSWORD_ENV_VARIABLE_NAME> | ex password | Elasticsearch password. The name of this variable is not defined at the service level; it is received from the Partition service. Each tenant can have its own ENV variable name, and that variable must be present in the ENV of the Indexer service; see "Properties set in Partition service" | yes | - |
Defined in the default application property file, but can be overridden:
name | value | description | sensitive? | source |
---|---|---|---|---|
LOG_PREFIX | service | Logging prefix | no | - |
LOG_LEVEL | **** | Logging level | no | - |
SECURITY_HTTPS_CERTIFICATE_TRUST | ex false | If 'true', the Elastic client connection uses TrustSelfSignedStrategy() | no | output of infrastructure deployment |
REDIS_SEARCH_HOST | ex 127.0.0.1 | Redis host | no | |
REDIS_SEARCH_PORT | ex 6379 | Redis host port | no | |
REDIS_SEARCH_PASSWORD | ex ***** | Redis host password | yes | |
REDIS_SEARCH_WITH_SSL | ex true or false | Redis host SSL config | no | |
REDIS_SEARCH_EXPIRATION | ex 30 | Redis cache expiration in seconds | no | |
PARTITION_HOST | ex https://partition.com | Partition host | no | output of infrastructure deployment |
ENTITLEMENTS_HOST | ex https://entitlements.com | Entitlements host | no | output of infrastructure deployment |
STORAGE_HOST | ex https://storage.com | Storage host | no | output of infrastructure deployment |
INDEXER_QUEUE_HOST | ex http://indexer-queue/api/indexer-queue/v1/_dps/task-handlers/enqueue | Indexer-Queue host endpoint used for reprocessing tasks | no | output of infrastructure deployment |
SCHEMA_BASE_HOST | ex https://schema.com | Schema service host | no | output of infrastructure deployment |
GOOGLE_APPLICATION_CREDENTIALS | ex /path/to/directory/service-key.json | Service account credentials; only needed when running locally | yes | https://console.cloud.google.com/iam-admin/serviceaccounts |
These variables define service behavior and are used to switch between the Reference and Google Cloud environments. Overriding them, or using them in mixed mode, has not been tested.
Using Spring profiles is preferred.
name | value | description | sensitive? | source |
---|---|---|---|---|
PARTITION_AUTH_ENABLED | ex true or false | Disable or enable auth token provisioning for requests to the Partition service | no | - |
OQMDRIVER | rabbitmq or pubsub | OQM driver mode that defines which message broker will be used | no | - |
SERVICE_TOKEN_PROVIDER | GCP or OPENID | Service account token provider: GCP means use a Google service account, OPENID means use an OpenID provider such as Keycloak | no | - |
Pubsub configuration
Pub/Sub must have the following topics and subscriptions, with these names and configurations:
Topic name | Subscription name | Subscription config |
---|---|---|
indexing-progress | (Consumer not implemented) | (Consumer not implemented) |
records-changed | indexer-records-changed | Maximum delivery attempts: 10; Retry policy: Retry after exponential backoff delay; Minimum backoff duration: 0 seconds; Maximum backoff duration: 30 seconds; Grant forwarding permissions for dead letter |
records-changed-dead-letter | (Consumer not implemented) | (Consumer not implemented) |
reprocess | indexer-reprocess | Maximum delivery attempts: 5; Retry policy: Retry after exponential backoff delay; Minimum backoff duration: 10 seconds; Maximum backoff duration: 600 seconds; Grant forwarding permissions for dead letter |
reprocess-dead-letter | (Consumer not implemented) | (Consumer not implemented) |
schema-changed | indexer-schema-changed | Maximum delivery attempts: 5; Retry policy: Retry after exponential backoff delay; Minimum backoff duration: 10 seconds; Maximum backoff duration: 600 seconds; Grant forwarding permissions for dead letter |
schema-changed-dead-letter | (Consumer not implemented) | (Consumer not implemented) |
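The subscription settings in the table can also be written down as data and turned into `gcloud` commands. The following is a minimal sketch, not an official provisioning script. It assumes that each subscription forwards to the `<topic>-dead-letter` topic listed in the table (the table does not state the pairing explicitly), and the names `Subscription` and `subscription_command` are hypothetical. The flags used should be checked against the installed `gcloud` version, and granting forwarding permissions for the dead-letter topic is not covered here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscription:
    topic: str
    name: str
    max_delivery_attempts: int
    min_backoff_seconds: int
    max_backoff_seconds: int

    @property
    def dead_letter_topic(self) -> str:
        # Assumption: the dead-letter topic is the "<topic>-dead-letter" sibling.
        return f"{self.topic}-dead-letter"


# Values copied from the table above.
SUBSCRIPTIONS = [
    Subscription("records-changed", "indexer-records-changed", 10, 0, 30),
    Subscription("reprocess", "indexer-reprocess", 5, 10, 600),
    Subscription("schema-changed", "indexer-schema-changed", 5, 10, 600),
]


def subscription_command(sub: Subscription) -> str:
    """Build (but do not run) a gcloud command for one subscription."""
    parts = [
        "gcloud pubsub subscriptions create",
        sub.name,
        f"--topic={sub.topic}",
        f"--dead-letter-topic={sub.dead_letter_topic}",
        f"--max-delivery-attempts={sub.max_delivery_attempts}",
        f"--min-retry-delay={sub.min_backoff_seconds}s",
        f"--max-retry-delay={sub.max_backoff_seconds}s",
    ]
    return " ".join(parts)


if __name__ == "__main__":
    for subscription in SUBSCRIPTIONS:
        print(subscription_command(subscription))
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere; the topics themselves still have to exist before the subscriptions are created.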
Additional throughput configuration for PubSub subscription consumer via Partition service
Adjusting consumer throughput via the Partition service is possible but not required. There are 3 levels of consumers:
- MIN: for lightly loaded consumers, defaults (streams = 1, threads = 2, outstanding elements = 20)
- MID: for consumers with an average load, defaults (streams = 2, threads = 2, outstanding elements = 40)
- MAX: for the most heavily loaded consumers, defaults (streams = 2, threads = 5, outstanding elements = 100)
"max.sub.parallel.streams": {
"sensitive": false,
"value": 2
},
"max.sub.thread.per.stream": {
"sensitive": false,
"value": 5
},
"max.sub.max.outstanding.elements": {
"sensitive": false,
"value": 100
}
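As a rough illustration of how the level defaults and the Partition overrides might combine, the sketch below starts from the defaults listed above and applies any matching properties. It is not the consumer's actual implementation. It assumes that the MIN and MID levels use `min.sub.*` and `mid.sub.*` property names by analogy with the `max.sub.*` names shown above; that naming, and the names `Throughput` and `effective_throughput`, are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Throughput:
    streams: int
    threads: int
    outstanding_elements: int


# Defaults as listed for the three consumer levels.
DEFAULTS = {
    "min": Throughput(streams=1, threads=2, outstanding_elements=20),
    "mid": Throughput(streams=2, threads=2, outstanding_elements=40),
    "max": Throughput(streams=2, threads=5, outstanding_elements=100),
}


def effective_throughput(level: str, properties: dict) -> Throughput:
    """Apply Partition property overrides on top of the level defaults."""
    base = DEFAULTS[level]

    def read(suffix: str, fallback: int) -> int:
        # Assumption: keys follow the "<level>.sub.<suffix>" pattern.
        entry = properties.get(f"{level}.sub.{suffix}")
        if entry is None:
            return fallback
        return int(entry["value"])

    return Throughput(
        streams=read("parallel.streams", base.streams),
        threads=read("thread.per.stream", base.threads),
        outstanding_elements=read("max.outstanding.elements", base.outstanding_elements),
    )


if __name__ == "__main__":
    overrides = {"max.sub.thread.per.stream": {"sensitive": False, "value": 8}}
    print(effective_throughput("max", overrides))
```

Properties that are absent simply fall back to the level default, which matches the statement that the adjustment is optional.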
Properties set in Partition service
Note that properties can be set in Partition as sensitive.
In that case the property value must contain the name of an ENV variable, not the value itself.
That variable must be present in the environment of every service that needs the property.
Example:
"elasticsearch.port": {
"sensitive": false, <- value not sensitive
"value": "9243" <- will be used as is.
},
"elasticsearch.password": {
"sensitive": true, <- value is sensitive
"value": "ELASTIC_SEARCH_PASSWORD_OSDU" <- service consumer should have env variable ELASTIC_SEARCH_PASSWORD_OSDU with elastic search password
}
Nothing is hardcoded in the services; all behaviour is determined by the sensitivity of the property.
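The lookup rule described in this section can be sketched in a few lines. This is a minimal illustration, not the services' actual implementation; the function name `resolve_property` and its `env` parameter are hypothetical.

```python
import os


def resolve_property(entry: dict, env=None) -> str:
    """Return the effective value of one Partition property.

    Non-sensitive values are used as is. For sensitive properties the
    stored value is treated as the NAME of an environment variable.
    """
    env = os.environ if env is None else env
    value = entry["value"]
    if not entry.get("sensitive", False):
        return str(value)
    if value not in env:
        raise KeyError(f"environment variable {value!r} is not set")
    return env[value]


if __name__ == "__main__":
    # Illustrative environment; a real service would read os.environ.
    fake_env = {"ELASTIC_SEARCH_PASSWORD_OSDU": "s3cret"}
    print(resolve_property({"sensitive": False, "value": "9243"}, fake_env))
    print(resolve_property(
        {"sensitive": True, "value": "ELASTIC_SEARCH_PASSWORD_OSDU"}, fake_env))
```

Failing loudly when the variable is missing is a design choice of this sketch; it makes a misconfigured tenant visible at startup rather than at the first Elasticsearch call.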
Elasticsearch configuration
prefix: elasticsearch
The prefix can be overridden by either:
- the Spring Boot property elastic-search-properties-prefix
- the environment variable ELASTIC_SEARCH_PROPERTIES_PREFIX
Property set:
Property | Description |
---|---|
elasticsearch.host | server URL |
elasticsearch.port | server port |
elasticsearch.user | username |
elasticsearch.password | password |
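To make the effect of the prefix concrete, the sketch below derives the four property names from the prefix. It models only the environment-variable override; the Spring Boot property route is not represented. The function name `elastic_property_names` is hypothetical, and the override value used in the example is made up.

```python
import os

DEFAULT_PREFIX = "elasticsearch"
SUFFIXES = ("host", "port", "user", "password")


def elastic_property_names(env=None) -> dict:
    """Map each suffix to the Partition property name the service would look up."""
    env = os.environ if env is None else env
    prefix = env.get("ELASTIC_SEARCH_PROPERTIES_PREFIX", DEFAULT_PREFIX)
    return {suffix: f"{prefix}.{suffix}" for suffix in SUFFIXES}


if __name__ == "__main__":
    print(elastic_property_names({}))
    # "custom-es" is an illustrative override, not a value from this document.
    print(elastic_property_names({"ELASTIC_SEARCH_PROPERTIES_PREFIX": "custom-es"}))
```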
Example of a definition for a single tenant
# For the sensitive properties below, the value is the name of an env variable, not the actual value.
curl -L -X PATCH 'http://partition.com/api/partition/v1/partitions/opendes' \
  -H 'data-partition-id: opendes' \
  -H 'Authorization: Bearer ...' \
  -H 'Content-Type: application/json' \
  --data-raw '{
  "properties": {
    "elasticsearch.host": {
      "sensitive": false,
      "value": "elastic.us-central1.gc.cloud.es.io"
    },
    "elasticsearch.port": {
      "sensitive": false,
      "value": "9243"
    },
    "elasticsearch.user": {
      "sensitive": true,
      "value": "<USER_ENV_VARIABLE_NAME>"
    },
    "elasticsearch.password": {
      "sensitive": true,
      "value": "<PASSWORD_ENV_VARIABLE_NAME>"
    }
  }
}'
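The same request can be assembled programmatically, which avoids hand-editing JSON for each tenant. The sketch below uses only the Python standard library and builds the request without sending it. The function name `build_partition_patch` and its parameters are hypothetical, and the env variable names in the example are placeholders.

```python
import json
import urllib.request


def build_partition_patch(base_url, partition_id, token,
                          host, port, user_env, password_env):
    """Build a PATCH request equivalent to the curl example above."""
    payload = {
        "properties": {
            "elasticsearch.host": {"sensitive": False, "value": host},
            "elasticsearch.port": {"sensitive": False, "value": str(port)},
            # Sensitive entries carry env variable NAMES, never the secrets.
            "elasticsearch.user": {"sensitive": True, "value": user_env},
            "elasticsearch.password": {"sensitive": True, "value": password_env},
        }
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/partition/v1/partitions/{partition_id}",
        data=json.dumps(payload).encode("utf-8"),
        method="PATCH",
        headers={
            "data-partition-id": partition_id,
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    request = build_partition_patch(
        "http://partition.com", "opendes", "<token>",
        "elastic.us-central1.gc.cloud.es.io", 9243,
        "ELASTIC_USER_OPENDES", "ELASTIC_PASS_OPENDES",  # placeholder names
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out so the sketch can be run without a Partition service.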
Google Cloud service account configuration
TBD
Required roles |
---|
- |
Running E2E Tests
You will need to have the following environment variables defined.
name | value | description | sensitive? | source |
---|---|---|---|---|
ELASTIC_PASSWORD | ******** | Password for Elasticsearch | yes | output of infrastructure deployment |
ELASTIC_USER_NAME | ******** | User name for Elasticsearch | yes | output of infrastructure deployment |
ELASTIC_HOST | ex elastic.domain.com | Elasticsearch host | yes | output of infrastructure deployment |
ELASTIC_PORT | ex 9243 | Elasticsearch port | yes | output of infrastructure deployment |
GCLOUD_PROJECT | ex opendes | Google Cloud Project Id | no | output of infrastructure deployment |
INDEXER_HOST | ex https://os-indexer-dot-opendes.appspot.com/api/indexer/v2/ | Indexer API endpoint | no | output of infrastructure deployment |
ENTITLEMENTS_DOMAIN | ex opendes-gc.projects.com | OSDU R2 domain to run tests under | no | - |
OTHER_RELEVANT_DATA_COUNTRIES | ex US | Other relevant data countries of a valid legal tag | no | - |
LEGAL_TAG | ex opendes-demo-legaltag | Valid legal tag with other relevant data countries from DEFAULT_OTHER_RELEVANT_DATA_COUNTRIES | no | - |
DEFAULT_DATA_PARTITION_ID_TENANT1 | ex opendes | HTTP Header 'Data-Partition-ID' | no | - |
DEFAULT_DATA_PARTITION_ID_TENANT2 | ex opendes | HTTP Header 'Data-Partition-ID' | no | - |
SEARCH_INTEGRATION_TESTER | ******** | Service account for API calls. Note: this user must already have entitlements configured | yes | https://console.cloud.google.com/iam-admin/serviceaccounts |
SEARCH_HOST | ex http://localhost:8080/api/search/v2/ | Endpoint of the Search service | no | - |
STORAGE_HOST | ex http://os-storage-dot-opendes.appspot.com/api/storage/v2/ | Storage API endpoint | no | output of infrastructure deployment |
SECURITY_HTTPS_CERTIFICATE_TRUST | ex false | If 'true', the Elastic client connection uses TrustSelfSignedStrategy() | no | output of infrastructure deployment |
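Because the test run depends on many variables, a small pre-flight check can save a failed build. The sketch below lists the variables from the table and reports which ones are unset or empty. It assumes every variable in the table is required, which the table itself does not state; the name `missing_variables` is hypothetical.

```python
import os
import sys

# Names taken from the table above. Assumption: all of them are required.
REQUIRED_VARIABLES = (
    "ELASTIC_PASSWORD",
    "ELASTIC_USER_NAME",
    "ELASTIC_HOST",
    "ELASTIC_PORT",
    "GCLOUD_PROJECT",
    "INDEXER_HOST",
    "ENTITLEMENTS_DOMAIN",
    "OTHER_RELEVANT_DATA_COUNTRIES",
    "LEGAL_TAG",
    "DEFAULT_DATA_PARTITION_ID_TENANT1",
    "DEFAULT_DATA_PARTITION_ID_TENANT2",
    "SEARCH_INTEGRATION_TESTER",
    "SEARCH_HOST",
    "STORAGE_HOST",
    "SECURITY_HTTPS_CERTIFICATE_TRUST",
)


def missing_variables(env=None) -> list:
    """Return the names of variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]


if __name__ == "__main__":
    missing = missing_variables()
    if missing:
        # Only names are printed, never values, since several are sensitive.
        print("Missing environment variables: " + ", ".join(missing))
        sys.exit(1)
    print("All E2E test variables are set.")
```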
Entitlements configuration for integration accounts
INTEGRATION_TESTER | NO_DATA_ACCESS_TESTER |
---|---|
users, users.datalake.ops, service.storage.creator, service.entitlements.user, service.search.user, service.search.admin, data.test1, data.integration.test, users@{tenant1}@{domain}.com |
Execute the following command to build the code and run all the integration tests:
# Note: this assumes that the environment variables for integration tests as outlined
# above are already exported in your environment.
$ (cd testing/indexer-test-gc/ && mvn clean test)