The Fast.BI platform provides four operators for running data transformation pipelines, each optimized for specific use cases and requirements. Below is a detailed comparison of each operator type.
### K8S Operator (Default)

**Description:** The default operator for running data transformation pipelines in the Fast.BI platform.

**Best Used For:** Cost optimization and high-concurrency workloads (see the selection guide below).

**Trade-offs:**

| Pros | Cons |
|---|---|
| Most cost-effective | Slower execution speed |
| Excellent horizontal scaling | Resource startup overhead |
| Good resource isolation | Higher latency per task |
| Flexible deployment options | Additional pod creation time |
### Bash Operator

**Description:** Executes data pipelines directly within Data Orchestrator (Airflow) workers.

**Best Used For:** Balanced cost/speed requirements and resource-efficient workloads.

**Trade-offs:**

| Pros | Cons |
|---|---|
| Faster execution than K8S | Limited by worker resources |
| No pod creation overhead | No horizontal scaling |
| Simplified architecture | Requires Airflow resource planning |
| Lower latency | Potential resource contention |
### API Operator

**Description:** Runs data pipelines on dedicated project machines with pre-configured API servers.

**Best Used For:** Performance-critical pipelines that require quick execution.

**Trade-offs:**

| Pros | Cons |
|---|---|
| Fastest execution speed | Highest cost |
| No startup overhead | Always-on resources |
| Horizontal scaling per node | Dedicated infrastructure required |
| Immediate task execution | Resource underutilization possible |
### GKE Operator

**Description:** Creates isolated external Google Kubernetes Engine clusters for workload execution.

**Best Used For:** Workload isolation and external (client-facing) workloads.

**Trade-offs:**

| Pros | Cons |
|---|---|
| Complete isolation | Higher operational complexity |
| Clear cost attribution | Cluster creation overhead |
| External workload support | Additional GCP costs |
| Flexible resource allocation | Longer startup times |
### Operator Selection Guide

| Requirement | Recommended Operator |
|---|---|
| Cost Optimization | K8S Operator |
| Performance | API Operator |
| Balanced Cost/Speed | Bash Operator |
| Isolation | GKE Operator |
| High Concurrency | K8S Operator |
| Quick Execution | API Operator |
| External Workloads | GKE Operator |
| Resource Efficiency | Bash Operator |
## Core Platform Variables

Core variables define the fundamental settings for your Fast.BI data platform configuration. These settings are essential regardless of the operator type being used.
| Variable | Description | Default Value | Possible Values |
|---|---|---|---|
| `PLATFORM` | Defines the data orchestration platform | `Airflow` | `Airflow`, `Composer` |
| `DAG_OWNER` | Specifies the owner of the project | `Data-Orchestrator Workflows` | Any string value |
| `NAMESPACE` | Defines the execution namespace | `data-orchestration` | Any valid Kubernetes namespace |
| `OPERATOR` | Specifies the execution operator type | Depends on configuration | `k8s`, `bash`, `api`, `gke` |
| `POD_NAME` | Sets the pod name prefix for Kubernetes | Depends on operator | Any valid Kubernetes pod name |
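As an illustration, the core variables might be set as follows. This is a minimal sketch with hypothetical values, following the defaults listed above:

```yaml
# Minimal core settings (hypothetical values).
PLATFORM: 'Airflow'               # or 'Composer'
DAG_OWNER: 'Data-Orchestrator Workflows'
NAMESPACE: 'data-orchestration'   # any valid Kubernetes namespace
OPERATOR: 'k8s'                   # one of: k8s, bash, api, gke
POD_NAME: 'dbt-k8s'               # pod name prefix for Kubernetes pods
```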
**Repository and DAG Variables**

| Variable | Description | Default Value | Notes |
|---|---|---|---|
| `GIT_URL` | Base URL for repository access | `https://gitlab.fast.bi/` | Must include protocol and trailing slash |
| `DAG_ID` | Unique identifier for the DAG | Format: `{operator}_{project_name}` | Must be unique across platform |
| `DAG_SCHEDULE_INTERVAL` | Pipeline execution schedule | `@once` | Cron expression or Airflow preset |
| `DAG_TAG` | Pipeline classification tags | Based on operator | Array of relevant tags |
**Project Configuration Variables**

| Variable | Description | Default Value | Required |
|---|---|---|---|
| `PROJECT_ID` | Google Cloud project identifier | None | Yes |
| `PROJECT_LEVEL` | Environment classification | None | Yes |
| `IMAGE` | DBT core image version | Latest Fast.BI image | No |
| `DBT_PROJECT_NAME` | Project identifier | None | Yes |
| `DBT_PROJECT_DIRECTORY` | Project location in repository | Same as project name | No |
| `MANIFEST_NAME` | Parser manifest identifier | `{project_name}_manifest` | No |
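Tying these together, here is a hedged sketch for a hypothetical `sales_analytics` project; the `DAG_ID` follows the `{operator}_{project_name}` format, and the optional values simply restate the defaults above:

```yaml
# Hypothetical repository, DAG, and project settings.
GIT_URL: 'https://gitlab.fast.bi/'         # protocol and trailing slash required
DAG_ID: 'k8s_operator_sales_analytics'     # {operator}_{project_name}
DAG_SCHEDULE_INTERVAL: '@daily'            # cron expression or Airflow preset
DAG_TAG:
  - 'k8s_operator_dbt'
PROJECT_ID: 'my-gcp-project'
PROJECT_LEVEL: 'production'
DBT_PROJECT_NAME: 'sales_analytics'
DBT_PROJECT_DIRECTORY: 'sales_analytics'   # defaults to the project name
MANIFEST_NAME: 'sales_analytics_manifest'  # {project_name}_manifest
```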
## Pipeline Feature Flags

These flags control various aspects of data pipeline execution and functionality.
| Variable | Description | Default | Values |
|---|---|---|---|
| `DBT_SEED` | Enable seed data loading | `False` | `True`/`False` |
| `DBT_SEED_SHARDING` | Individual seed file tasks | `False` | `True`/`False` |
| `DBT_SOURCE` | Enable source loading | `False` | `True`/`False` |
| `DBT_SOURCE_SHARDING` | Individual source tasks | `False` | `True`/`False` |
| `DBT_SNAPSHOT` | Enable snapshot creation | `False` | `True`/`False` |
| `DBT_SNAPSHOT_SHARDING` | Individual snapshot tasks | `False` | `True`/`False` |
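For example, a sketch that loads sources and snapshots and splits each source into its own task might look like this (a hypothetical combination; note that a sharding flag only takes effect when its base flag is enabled, as described under variable dependencies below):

```yaml
# Hypothetical flag combination.
DBT_SOURCE: 'True'
DBT_SOURCE_SHARDING: 'True'    # only meaningful with DBT_SOURCE: 'True'
DBT_SNAPSHOT: 'True'
DBT_SNAPSHOT_SHARDING: 'False' # snapshots run as a single task
DBT_SEED: 'False'              # seeds disabled
```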
**Debug Variables**

| Variable | Description | Default | Values |
|---|---|---|---|
| `DEBUG` | Enable connection verification | `False` | `True`/`False` |
| `MODEL_DEBUG_LOG` | Enable extended logging | `False` | `True`/`False` |
## Service Integrations

Controls for various platform integrations and additional services.
| Variable | Description | Default | Notes |
|---|---|---|---|
| `DATA_QUALITY` | Enable quality service | `False` | Enables quality checks |
| `DATAHUB_ENABLED` | Enable data governance | `False` | Enables metadata collection |
| `DATA_ANALYSIS_PROJECT` | BI service metadata sharing | None | Optional project identifier |
**Airbyte Integration Variables**

| Variable | Description | Default | Required For |
|---|---|---|---|
| `AIRBYTE_CONNECTION_ID` | Replication connection ID | None | Data replication |
| `AIRBYTE_REPLICATION_FLAG` | Enable managed replication | `False` | Orchestration control |
| `AIRBYTE_WORKSPACE_ID` | Replication environment ID | None | Airbyte integration |
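A combined sketch enabling quality checks, metadata collection, and managed Airbyte replication could look like the following; the Airbyte IDs are placeholders you would replace with values from your own Airbyte workspace:

```yaml
# Hypothetical integration settings.
DATA_QUALITY: 'True'                        # enable quality checks
DATAHUB_ENABLED: 'True'                     # enable metadata collection
AIRBYTE_REPLICATION_FLAG: 'True'            # requires the two IDs below
AIRBYTE_CONNECTION_ID: 'your-connection-id' # placeholder
AIRBYTE_WORKSPACE_ID: 'your-workspace-id'   # placeholder
```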
## Variable Dependencies

Some variables have dependencies on others. Here are the key relationships:

**Replication Configuration**

When `AIRBYTE_REPLICATION_FLAG` is `True`:

- `AIRBYTE_CONNECTION_ID` is required
- `AIRBYTE_WORKSPACE_ID` is required

**Sharding Configuration**

When a `*_SHARDING` flag is `True`, the corresponding base flag must also be `True` (for example, `DBT_SEED_SHARDING` requires `DBT_SEED`).

**Data Quality**

When `DATA_QUALITY` is `True`, quality checks are enabled for the pipeline.
## General Guidelines

- **Naming Conventions**: follow the documented formats for DAG IDs, manifest names, and pod names.
- **Schedule Intervals**: use a cron expression or an Airflow preset (e.g., `@daily`, `@weekly`).
- **Resource Management**: plan resources according to the operator in use.
## K8S Operator Configuration

The K8S (Kubernetes) Operator is the default operator in the Fast.BI platform. This section details the specific configuration variables used when deploying data pipelines with the K8S operator.

All K8S operator variables are contained within a configuration block identified by:

```yaml
K8S_SECRETS_DBT_PRJ_[PROJECT_NAME]:
```

where `[PROJECT_NAME]` is your project name in uppercase.
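For instance, a minimal block for a hypothetical project named SALES would start like this (a complete example appears in the configuration examples later in this document):

```yaml
# Hypothetical minimal K8S operator block for a project named SALES.
K8S_SECRETS_DBT_PRJ_SALES:
  OPERATOR: 'k8s'
  POD_NAME: 'dbt-k8s'
  NAMESPACE: 'data-orchestration'
```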
**Operator-Specific Variables**

| Variable | Required | Description | Notes |
|---|---|---|---|
| `OPERATOR` | Yes | Must be set to `k8s` | Defines operator type |
| `POD_NAME` | Yes | Base name for Kubernetes pods | Prefix for all created pods |
| `NAMESPACE` | Yes | Kubernetes namespace | Where pods will be created |
**DAG Configuration**

| Variable | Required | Description | Default Value |
|---|---|---|---|
| `PLATFORM` | Yes | Orchestration platform identifier | `Airflow` |
| `DAG_OWNER` | Yes | Owner of the data pipeline | `Data-Orchestrator Workflows` |
| `DAG_ID` | Yes | Format: `k8s_operator_[project_name]` | Auto-generated |
| `DAG_SCHEDULE_INTERVAL` | Yes | Pipeline schedule | `@once` |
| `DAG_TAG` | No | Pipeline classification tags | `k8s_operator_dbt` |
**Google Cloud Configuration**

| Variable | Required | Description | Notes |
|---|---|---|---|
| `PROJECT_ID` | Yes | Google Cloud project identifier | Must be valid GCP ID |
| `PROJECT_LEVEL` | Yes | Environment classification | e.g., `production`, `development` |
| `GIT_URL` | Yes | Repository base URL | Must include protocol |
| `IMAGE` | Yes | DBT core image reference | Full container image path |
**DBT Project Configuration**

| Variable | Required | Description | Notes |
|---|---|---|---|
| `DBT_PROJECT_NAME` | Yes | Project identifier | Must be unique |
| `DBT_PROJECT_DIRECTORY` | Yes | Project location in repo | Usually same as project name |
| `MANIFEST_NAME` | Yes | Manifest file identifier | Format: `[project_name]_manifest` |
**Pipeline Feature Flags**

| Variable | Required | Description | Default |
|---|---|---|---|
| `DBT_SEED` | No | Enable seed data loading | `False` |
| `DBT_SEED_SHARDING` | No | Individual seed file tasks | `False` |
| `DBT_SOURCE` | No | Enable source loading | `False` |
| `DBT_SOURCE_SHARDING` | No | Individual source tasks | `False` |
| `DBT_SNAPSHOT` | No | Enable snapshot creation | `False` |
| `DBT_SNAPSHOT_SHARDING` | No | Individual snapshot tasks | `False` |
**Debug Flags**

| Variable | Required | Description | Default |
|---|---|---|---|
| `DEBUG` | No | Enable connection verification | `False` |
| `MODEL_DEBUG_LOG` | No | Enable extended logging | `False` |
**Service Integrations**

| Variable | Required | Description | Default |
|---|---|---|---|
| `DATA_QUALITY` | No | Enable quality service | `False` |
| `DATAHUB_ENABLED` | No | Enable data governance | `False` |
| `DATA_ANALYSIS_PROJECT` | No | BI service metadata sharing | Not set |
**Airbyte Integration**

| Variable | Required | Description | Conditions |
|---|---|---|---|
| `AIRBYTE_CONNECTION_ID` | No | Replication connection ID | Required if using Airbyte |
| `AIRBYTE_REPLICATION_FLAG` | No | Enable managed replication | Required if using Airbyte |
| `AIRBYTE_WORKSPACE_ID` | No | Replication environment ID | Required if using Airbyte |
**Best Practices**

Key areas to plan when using the K8S operator:

- Pod naming
- Namespace strategy
- Resource planning
- Performance optimization
## Bash Operator Configuration

The Bash Operator executes data pipelines directly within Airflow workers, offering faster execution times than the K8S operator. This section details the configuration variables specific to the Bash operator deployment.

Bash operator variables are contained within a configuration block identified by:

```yaml
BASH_SECRETS_DBT_PRJ_[PROJECT_NAME]:
```

where `[PROJECT_NAME]` is your project name in uppercase.
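For instance, a minimal block for a hypothetical project named MARKETING would start like this; unlike the K8S operator, no `POD_NAME` is needed because tasks run inside the Airflow workers:

```yaml
# Hypothetical minimal Bash operator block for a project named MARKETING.
BASH_SECRETS_DBT_PRJ_MARKETING:
  OPERATOR: 'bash'
  NAMESPACE: 'data-orchestration'
```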
**Operator-Specific Variables**

| Variable | Required | Description | Notes |
|---|---|---|---|
| `OPERATOR` | Yes | Must be set to `bash` | Defines operator type |
| `NAMESPACE` | Yes | Airflow worker namespace | Used for resource organization |
**DAG Configuration**

| Variable | Required | Description | Default Value |
|---|---|---|---|
| `PLATFORM` | Yes | Orchestration platform identifier | `Airflow` |
| `DAG_OWNER` | Yes | Owner of the data pipeline | `Data-Orchestrator Workflows` |
| `DAG_ID` | Yes | Format: `bash_operator_[project_name]` | Auto-generated |
| `DAG_SCHEDULE_INTERVAL` | Yes | Pipeline schedule | `@once` |
| `DAG_TAG` | No | Pipeline classification tags | `bash_operator_dbt` |
**Google Cloud Configuration**

| Variable | Required | Description | Notes |
|---|---|---|---|
| `PROJECT_ID` | Yes | Google Cloud project identifier | Must be valid GCP ID |
| `PROJECT_LEVEL` | Yes | Environment classification | e.g., `production`, `development` |
| `GIT_URL` | Yes | Repository base URL | Must include protocol |
| `IMAGE` | Yes | DBT core image reference | Used for version control |
**DBT Project Configuration**

| Variable | Required | Description | Notes |
|---|---|---|---|
| `DBT_PROJECT_NAME` | Yes | Project identifier | Must be unique |
| `DBT_PROJECT_DIRECTORY` | Yes | Project location in repo | Usually same as project name |
| `MANIFEST_NAME` | Yes | Manifest file identifier | Format: `[project_name]_manifest` |
**Pipeline Feature Flags**

| Variable | Required | Description | Default |
|---|---|---|---|
| `DBT_SEED` | No | Enable seed data loading | `False` |
| `DBT_SEED_SHARDING` | No | Individual seed file tasks | `False` |
| `DBT_SOURCE` | No | Enable source loading | `False` |
| `DBT_SOURCE_SHARDING` | No | Individual source tasks | `False` |
| `DBT_SNAPSHOT` | No | Enable snapshot creation | `False` |
| `DBT_SNAPSHOT_SHARDING` | No | Individual snapshot tasks | `False` |
**Debug Flags**

| Variable | Required | Description | Default |
|---|---|---|---|
| `DEBUG` | No | Enable connection verification | `False` |
| `MODEL_DEBUG_LOG` | No | Enable extended logging | `False` |
**Service Integrations**

| Variable | Required | Description | Default |
|---|---|---|---|
| `DATA_QUALITY` | No | Enable quality service | `False` |
| `DATAHUB_ENABLED` | No | Enable data governance | `False` |
| `DATA_ANALYSIS_PROJECT` | No | BI service metadata sharing | Not set |
**Airbyte Integration**

| Variable | Required | Description | Conditions |
|---|---|---|---|
| `AIRBYTE_CONNECTION_ID` | No | Replication connection ID | Required if using Airbyte |
| `AIRBYTE_REPLICATION_FLAG` | No | Enable managed replication | Required if using Airbyte |
| `AIRBYTE_WORKSPACE_ID` | No | Replication environment ID | Required if using Airbyte |
**Best Practices and Considerations**

Key areas to plan and monitor when using the Bash operator:

- Resource planning
- Environment management
- Performance optimization
- Security considerations
- Execution environment
- Limitations
- Monitoring requirements
## API Operator Configuration

The API Operator is Fast.BI's high-performance option, running data pipelines on dedicated project machines through API endpoints. It provides the fastest execution times and is ideal for time-sensitive or resource-intensive workloads.

API operator variables are contained within a configuration block identified by:

```yaml
API_SECRETS_DBT_PRJ_[PROJECT_NAME]:
```

where `[PROJECT_NAME]` is your project name in uppercase.
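For instance, a minimal block for a hypothetical project named FINANCE would start like this, using the default `dbt-server` namespace:

```yaml
# Hypothetical minimal API operator block for a project named FINANCE.
API_SECRETS_DBT_PRJ_FINANCE:
  OPERATOR: 'api'
  NAMESPACE: 'dbt-server'   # default API server namespace
```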
**Operator-Specific Variables**

| Variable | Required | Description | Notes |
|---|---|---|---|
| `OPERATOR` | Yes | Must be set to `api` | Defines operator type |
| `NAMESPACE` | Yes | Default: `dbt-server` | API server namespace |
**DAG Configuration**

| Variable | Required | Description | Default Value |
|---|---|---|---|
| `PLATFORM` | Yes | Orchestration platform identifier | `Airflow` |
| `DAG_OWNER` | Yes | Owner of the data pipeline | `Data-Orchestrator Workflows` |
| `DAG_ID` | Yes | Format: `api_operator_[project_name]` | Auto-generated |
| `DAG_SCHEDULE_INTERVAL` | Yes | Pipeline schedule | `@once` |
| `DAG_TAG` | No | Pipeline classification tags | `api_operator_dbt` |
**Google Cloud Configuration**

| Variable | Required | Description | Notes |
|---|---|---|---|
| `PROJECT_ID` | Yes | Google Cloud project identifier | Must be valid GCP ID |
| `PROJECT_LEVEL` | Yes | Environment classification | e.g., `production`, `development` |
| `GIT_URL` | Yes | Repository base URL | Must include protocol |
| `IMAGE` | Yes | DBT core image reference | Used for API server deployment |
**DBT Project Configuration**

| Variable | Required | Description | Notes |
|---|---|---|---|
| `DBT_PROJECT_NAME` | Yes | Project identifier | Must be unique |
| `DBT_PROJECT_DIRECTORY` | Yes | Project location in repo | Usually same as project name |
| `MANIFEST_NAME` | Yes | Manifest file identifier | Format: `[project_name]_manifest` |
**Pipeline Feature Flags**

| Variable | Required | Description | Default |
|---|---|---|---|
| `DBT_SEED` | No | Enable seed data loading | `False` |
| `DBT_SEED_SHARDING` | No | Individual seed file tasks | `False` |
| `DBT_SOURCE` | No | Enable source loading | `False` |
| `DBT_SOURCE_SHARDING` | No | Individual source tasks | `False` |
| `DBT_SNAPSHOT` | No | Enable snapshot creation | `False` |
| `DBT_SNAPSHOT_SHARDING` | No | Individual snapshot tasks | `False` |
**Debug Flags**

| Variable | Required | Description | Default |
|---|---|---|---|
| `DEBUG` | No | Enable connection verification | `False` |
| `MODEL_DEBUG_LOG` | No | Enable extended logging | `False` |
**Service Integrations**

| Variable | Required | Description | Default |
|---|---|---|---|
| `DATA_QUALITY` | No | Enable quality service | `False` |
| `DATAHUB_ENABLED` | No | Enable data governance | `False` |
| `DATA_ANALYSIS_PROJECT` | No | BI service metadata sharing | Not set |
**Airbyte Integration**

| Variable | Required | Description | Conditions |
|---|---|---|---|
| `AIRBYTE_CONNECTION_ID` | No | Replication connection ID | Required if using Airbyte |
| `AIRBYTE_REPLICATION_FLAG` | No | Enable managed replication | Required if using Airbyte |
| `AIRBYTE_WORKSPACE_ID` | No | Replication environment ID | Required if using Airbyte |
**Best Practices and Considerations**

Key areas to plan and monitor when using the API operator:

- API server management
- Resource optimization
- Performance monitoring
- Security considerations
- API environment
- Advantages and trade-offs
- Monitoring requirements
## GKE Operator Configuration

The GKE (Google Kubernetes Engine) Operator provides isolated workload execution in dedicated Google Cloud clusters. This operator creates and manages separate GKE clusters for data pipeline execution, offering complete isolation and clear cost attribution.

GKE operator variables are contained within a configuration block identified by:

```yaml
GKE_SECRETS_DBT_PRJ_[PROJECT_NAME]:
```

where `[PROJECT_NAME]` is your project name in uppercase.
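For instance, a minimal block for a hypothetical project named CLIENT would start like this, using the default cluster settings listed below:

```yaml
# Hypothetical minimal GKE operator block for a project named CLIENT.
GKE_SECRETS_DBT_PRJ_CLIENT:
  OPERATOR: 'gke'
  CLUSTER_NAME: 'dbt-bi-platform-workload'   # default cluster name
  CLUSTER_ZONE: 'europe-central2'            # default zone
```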
**Cluster Configuration**

| Variable | Required | Description | Default Value |
|---|---|---|---|
| `CLUSTER_NAME` | Yes | Name of the GKE cluster | `dbt-bi-platform-workload` |
| `CLUSTER_ZONE` | Yes | GCP zone for cluster deployment | `europe-central2` |
| `CLUSTER_NODE_COUNT` | Yes | Number of nodes in cluster | `3` |
| `CLUSTER_MACHINE_TYPE` | Yes | GCP machine type | `e2-highcpu-4` |
| `CLUSTER_MACHINE_DISK_TYPE` | Yes | Type of disk for nodes | `pd-ssd` |
**Network Configuration**

| Variable | Required | Description | Default |
|---|---|---|---|
| `NETWORK` | Yes | VPC network reference | Will be updated... |
| `SUBNETWORK` | Yes | VPC subnetwork reference | Will be updated... |
| `PRIVATENODES_IP` | Yes | IP range for private nodes | `10.201.97.128/28` |
| `SHARED_VPC` | No | Enable shared VPC | `false` |
| `SERVICES_SECONDARY_RANGE_NAME` | Yes | Secondary range for services | Will be updated... |
| `CLUSTER_SECONDARY_RANGE_NAME` | Yes | Secondary range for cluster | Will be updated... |
**Core Platform Variables**

| Variable | Required | Description | Default Value |
|---|---|---|---|
| `PLATFORM` | Yes | Orchestration platform | `Airflow` |
| `NAMESPACE` | Yes | Kubernetes namespace | `default` |
| `OPERATOR` | Yes | Must be set to `gke` | `gke` |
| `POD_NAME` | Yes | Base name for pods | `dbt-gke` |
**Google Cloud Configuration**

| Variable | Required | Description | Notes |
|---|---|---|---|
| `PROJECT_ID` | Yes | Google Cloud project ID | Must be valid GCP ID |
| `PROJECT_LEVEL` | Yes | Environment classification | e.g., `production` |
| `GIT_URL` | Yes | Repository base URL | Must include protocol |
| `IMAGE` | Yes | DBT core image reference | Full container image path |
**DAG Configuration**

| Variable | Required | Description | Default |
|---|---|---|---|
| `DAG_ID` | Yes | Format: `gke_operator_[project_name]` | Auto-generated |
| `DAG_SCHEDULE_INTERVAL` | Yes | Pipeline schedule | `@once` |
| `DAG_OWNER` | Yes | Owner of the pipeline | `Data-Orchestrator Workflows` |
| `DAG_TAG` | No | Pipeline classification tags | `gke_operator_dbt` |
**DBT Project Configuration**

| Variable | Required | Description | Notes |
|---|---|---|---|
| `DBT_PROJECT_NAME` | Yes | Project identifier | Must be unique |
| `DBT_PROJECT_DIRECTORY` | Yes | Project location in repo | Usually same as project name |
| `MANIFEST_NAME` | Yes | Manifest file identifier | Format: `[project_name]_manifest` |
**Pipeline Feature Flags**

| Variable | Required | Description | Default |
|---|---|---|---|
| `DBT_SEED` | No | Enable seed data loading | `False` |
| `DBT_SEED_SHARDING` | No | Individual seed file tasks | `False` |
| `DBT_SOURCE` | No | Enable source loading | `False` |
| `DBT_SOURCE_SHARDING` | No | Individual source tasks | `False` |
| `DBT_SNAPSHOT` | No | Enable snapshot creation | `False` |
| `DBT_SNAPSHOT_SHARDING` | No | Individual snapshot tasks | `False` |
**Service Integrations**

| Variable | Required | Description | Default |
|---|---|---|---|
| `DATA_QUALITY` | No | Enable quality service | `False` |
| `DATAHUB_ENABLED` | No | Enable data governance | `False` |
| `DATA_ANALYSIS_PROJECT` | No | BI service metadata sharing | Not set |
**Airbyte Integration**

| Variable | Required | Description | Conditions |
|---|---|---|---|
| `AIRBYTE_CONNECTION_ID` | No | Replication connection ID | Required if using Airbyte |
| `AIRBYTE_REPLICATION_FLAG` | No | Enable managed replication | Required if using Airbyte |
| `AIRBYTE_WORKSPACE_ID` | No | Replication environment ID | Required if using Airbyte |
**Best Practices and Considerations**

Key areas to plan and monitor when using the GKE operator:

- Cluster configuration
- Network planning
- Resource management
- Security configuration
- Cluster lifecycle
- Network requirements
- Monitoring needs
## Configuration Examples

**K8S Operator**

```yaml
K8S_SECRETS_DBT_PRJ_SALES:
  PLATFORM: 'Airflow'
  DAG_OWNER: 'Data Team'
  NAMESPACE: 'data-orchestration'
  OPERATOR: 'k8s'
  POD_NAME: 'dbt-k8s'
  PROJECT_ID: 'my-gcp-project'
  PROJECT_LEVEL: 'production'
  DBT_PROJECT_NAME: 'sales_analytics'
  DAG_SCHEDULE_INTERVAL: '@daily'
  DAG_TAG:
    - 'k8s_operator_dbt'
    - 'sales'
```
**Bash Operator**

```yaml
BASH_SECRETS_DBT_PRJ_MARKETING:
  PLATFORM: 'Airflow'
  DAG_OWNER: 'Marketing Analytics'
  NAMESPACE: 'data-orchestration'
  OPERATOR: 'bash'
  PROJECT_ID: 'my-gcp-project'
  PROJECT_LEVEL: 'production'
  DBT_PROJECT_NAME: 'marketing_analytics'
  DAG_SCHEDULE_INTERVAL: '@hourly'
  DAG_TAG:
    - 'bash_operator_dbt'
    - 'marketing'
```
**API Operator**

```yaml
API_SECRETS_DBT_PRJ_FINANCE:
  PLATFORM: 'Airflow'
  DAG_OWNER: 'Finance Team'
  NAMESPACE: 'dbt-server'
  OPERATOR: 'api'
  PROJECT_ID: 'my-gcp-project'
  PROJECT_LEVEL: 'production'
  DBT_PROJECT_NAME: 'finance_reporting'
  DAG_SCHEDULE_INTERVAL: '0 */4 * * *'
  DAG_TAG:
    - 'api_operator_dbt'
    - 'finance'
```
**GKE Operator (External Workload)**

```yaml
GKE_SECRETS_DBT_PRJ_EXTERNAL:
  CLUSTER_NAME: 'client-workload-cluster'
  CLUSTER_ZONE: 'europe-central2'
  CLUSTER_NODE_COUNT: '3'
  CLUSTER_MACHINE_TYPE: 'e2-highcpu-4'
  CLUSTER_MACHINE_DISK_TYPE: 'pd-ssd'
  NETWORK: 'projects/shared-vpc-project/global/networks/main-vpc'
  SUBNETWORK: 'projects/shared-vpc-project/regions/europe-central2/subnetworks/analytics-subnet'
  OPERATOR: 'gke'
  PROJECT_ID: 'client-gcp-project'
  PROJECT_LEVEL: 'production'
  DBT_PROJECT_NAME: 'client_analytics'
  DAG_SCHEDULE_INTERVAL: '@daily'
```
**Data Quality Pipeline (K8S Operator)**

```yaml
K8S_SECRETS_DBT_PRJ_QUALITY:
  PLATFORM: 'Airflow'
  DAG_OWNER: 'Data Quality Team'
  NAMESPACE: 'data-orchestration'
  OPERATOR: 'k8s'
  PROJECT_ID: 'my-gcp-project'
  DBT_PROJECT_NAME: 'data_quality_checks'
  DATA_QUALITY: 'True'
  MODEL_DEBUG_LOG: 'True'
  DAG_SCHEDULE_INTERVAL: '0 0 * * *'
  DBT_SOURCE: 'True'
  DBT_SOURCE_SHARDING: 'True'
```
**Airbyte Replication (API Operator)**

```yaml
API_SECRETS_DBT_PRJ_REPLICATION:
  PLATFORM: 'Airflow'
  OPERATOR: 'api'
  PROJECT_ID: 'my-gcp-project'
  DBT_PROJECT_NAME: 'data_replication'
  AIRBYTE_CONNECTION_ID: 'abc123'
  AIRBYTE_REPLICATION_FLAG: 'True'
  AIRBYTE_WORKSPACE_ID: 'workspace_1'
  DAG_SCHEDULE_INTERVAL: '*/30 * * * *'
```
**Shared VPC (GKE Operator)**

```yaml
GKE_SECRETS_DBT_PRJ_SHARED:
  CLUSTER_NAME: 'shared-analytics-cluster'
  CLUSTER_ZONE: 'europe-central2'
  SHARED_VPC: 'true'
  NETWORK: 'projects/shared-vpc-project/global/networks/shared-vpc'
  SUBNETWORK: 'projects/shared-vpc-project/regions/europe-central2/subnetworks/analytics'
  OPERATOR: 'gke'
  PROJECT_ID: 'shared-gcp-project'
  DBT_PROJECT_NAME: 'shared_analytics'
  DATAHUB_ENABLED: 'True'
```
**Daily ETL (K8S Operator)**

```yaml
K8S_SECRETS_DBT_PRJ_ETL:
  PLATFORM: 'Airflow'
  OPERATOR: 'k8s'
  PROJECT_ID: 'my-gcp-project'
  DBT_PROJECT_NAME: 'daily_etl'
  DAG_SCHEDULE_INTERVAL: '@daily'
  DBT_SOURCE: 'True'
  DBT_SNAPSHOT: 'True'
  DATA_QUALITY: 'True'
```
**Real-Time Analytics (API Operator)**

```yaml
API_SECRETS_DBT_PRJ_REALTIME:
  PLATFORM: 'Airflow'
  OPERATOR: 'api'
  PROJECT_ID: 'my-gcp-project'
  DBT_PROJECT_NAME: 'realtime_analytics'
  DAG_SCHEDULE_INTERVAL: '*/15 * * * *'
  MODEL_DEBUG_LOG: 'True'
  DATA_QUALITY: 'True'
```
**Isolated Client Workload (GKE Operator)**

```yaml
GKE_SECRETS_DBT_PRJ_CLIENT:
  CLUSTER_NAME: 'client-isolated-cluster'
  CLUSTER_ZONE: 'europe-central2'
  PRIVATENODES_IP: '10.0.0.0/28'
  OPERATOR: 'gke'
  PROJECT_ID: 'client-project'
  DBT_PROJECT_NAME: 'client_workload'
  DATA_QUALITY: 'True'
  DATAHUB_ENABLED: 'True'
```
## Common Configuration Patterns

**Hourly Updates**

```yaml
DAG_SCHEDULE_INTERVAL: '@hourly'
```

**Custom Cron**

```yaml
DAG_SCHEDULE_INTERVAL: '0 */4 * * *' # Every 4 hours
```

**Daily at Specific Time**

```yaml
DAG_SCHEDULE_INTERVAL: '0 2 * * *' # Daily at 2 AM
```
**Full Monitoring Stack**

```yaml
DEBUG: 'True'
MODEL_DEBUG_LOG: 'True'
DATA_QUALITY: 'True'
DATAHUB_ENABLED: 'True'
```

**Data Replication Setup**

```yaml
AIRBYTE_REPLICATION_FLAG: 'True'
AIRBYTE_CONNECTION_ID: 'connection_id'
AIRBYTE_WORKSPACE_ID: 'workspace_id'
```

**Sharding Configuration**

```yaml
DBT_SOURCE: 'True'
DBT_SOURCE_SHARDING: 'True'
DBT_SEED: 'True'
DBT_SEED_SHARDING: 'True'
```
## Best Practices

Key areas to review when configuring a project:

- Project naming
- Resource configuration
- Integration setup

**Value Formatting**

- Boolean values: use the string values `'True'` or `'False'` (not `true`/`false`)
- Project names
- Schedule intervals

**Operator Selection**
| Consideration | Recommended Operator |
|---|---|
| Cost Optimization | K8S Operator |
| Performance Priority | API Operator |
| Balanced Resources | Bash Operator |
| Client Isolation | GKE Operator |
**Operations, Security, and Monitoring**

Additional areas to plan:

- Resource scaling
- Cost management
- Access control
- Network security
- Data protection
- Essential metrics
- Health checks
- Alerting
## Troubleshooting

**Connection Issues**

Enable the debug flags to verify connections and capture extended logs:

```yaml
DEBUG: 'True'
MODEL_DEBUG_LOG: 'True'
```

Other common areas to investigate are performance problems and integration failures.
## Maintenance and Support

Operational practices to establish:

- Configuration validation
- Resource planning
- Change management
- Daily checks
- Weekly reviews
- Monthly maintenance
- Required documentation
- Change records
- Internal support
- External resources
- Configuration management
- Deployment process