Documentation ¶
Overview ¶
Package dataflowlib translates a Beam pipeline model to the Dataflow API job model, for submission to Google Cloud Dataflow.
Index ¶
- func Execute(ctx context.Context, raw *pipepb.Pipeline, opts *JobOptions, ...) (*dataflowPipelineResult, error)
- func FromMetricUpdates(allMetrics []*df.MetricUpdate, p *pipepb.Pipeline) *metrics.Results
- func GetMetrics(ctx context.Context, client *df.Service, project, region, jobID string) (*df.JobMetrics, error)
- func GetRunningJobByName(client *df.Service, project, region string, name string) (*df.Job, error)
- func NewClient(ctx context.Context, endpoint string) (*df.Service, error)
- func PrintJob(ctx context.Context, job *df.Job)
- func ResolveXLangArtifacts(ctx context.Context, edges []*graph.MultiEdge, project, url string) ([]string, error)
- func StageModel(ctx context.Context, project, modelURL string, model []byte) error
- func Submit(ctx context.Context, client *df.Service, project, region string, job *df.Job, ...) (*df.Job, error)
- func Translate(ctx context.Context, p *pipepb.Pipeline, opts *JobOptions, ...) (*df.Job, error)
- func WaitForCompletion(ctx context.Context, client *df.Service, project, region, jobID string) error
- type JobOptions
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func Execute ¶
func Execute(ctx context.Context, raw *pipepb.Pipeline, opts *JobOptions, workerURL, modelURL, endpoint string, async bool) (*dataflowPipelineResult, error)
Execute submits a pipeline as a Dataflow job.
func FromMetricUpdates ¶
func FromMetricUpdates(allMetrics []*df.MetricUpdate, p *pipepb.Pipeline) *metrics.Results
FromMetricUpdates extracts metrics from a slice of MetricUpdate objects and groups them into counters, distributions and gauges.
Dataflow currently only reports Counter and Distribution metrics to Cloud Monitoring. Gauge metrics are not supported. The output metrics.Results will not contain any gauges.
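The grouping described above can be sketched as follows. This is a minimal illustration, not the package's implementation: metricUpdate, groupedResults, and groupUpdates are hypothetical stand-ins for df.MetricUpdate, metrics.Results, and FromMetricUpdates, and the Kind labels are assumed so that the snippet compiles without the Beam SDK.

```go
package main

import "fmt"

// metricUpdate is a hypothetical, simplified stand-in for df.MetricUpdate.
type metricUpdate struct {
	Name  string
	Kind  string // assumed labels: "counter", "distribution", "gauge"
	Value int64
}

// groupedResults is a hypothetical stand-in for metrics.Results.
type groupedResults struct {
	Counters      []metricUpdate
	Distributions []metricUpdate
	Gauges        []metricUpdate // stays empty: gauges are not reported
}

// groupUpdates buckets updates by kind. Gauge updates (and any unknown
// kind) are skipped, so the Gauges slice in the result is always empty.
func groupUpdates(updates []metricUpdate) groupedResults {
	var res groupedResults
	for _, u := range updates {
		switch u.Kind {
		case "counter":
			res.Counters = append(res.Counters, u)
		case "distribution":
			res.Distributions = append(res.Distributions, u)
		}
	}
	return res
}

func main() {
	res := groupUpdates([]metricUpdate{
		{Name: "elements", Kind: "counter", Value: 42},
		{Name: "latency_ms", Kind: "distribution", Value: 7},
		{Name: "queue_depth", Kind: "gauge", Value: 3},
	})
	fmt.Println(len(res.Counters), len(res.Distributions), len(res.Gauges))
}
```

Callers that rely on gauges would need another source for them, since the grouped result never carries any.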
func GetMetrics ¶
func GetMetrics(ctx context.Context, client *df.Service, project, region, jobID string) (*df.JobMetrics, error)
GetMetrics returns a collection of metrics describing the progress of a job by calling the Cloud Monitoring service.
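func GetRunningJobByName ¶
func GetRunningJobByName(client *df.Service, project, region string, name string) (*df.Job, error)
GetRunningJobByName gets a running Dataflow job by its name and returns an error if none matches.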
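The lookup-or-error contract can be sketched without a live service. In the snippet below, job, errNoRunningJob, and findRunningJobByName are hypothetical stand-ins that work on an in-memory slice; the real function queries the Dataflow API through the df.Service client, and how it lists and filters jobs is not shown here.

```go
package main

import (
	"errors"
	"fmt"
)

// job is a hypothetical, simplified stand-in for df.Job.
type job struct {
	ID    string
	Name  string
	State string
}

// errNoRunningJob is an illustrative sentinel error, not part of dataflowlib.
var errNoRunningJob = errors.New("no running job found")

// findRunningJobByName returns the first job that has the given name and is
// in the running state, or an error if no job matches.
func findRunningJobByName(jobs []job, name string) (job, error) {
	for _, j := range jobs {
		if j.Name == name && j.State == "JOB_STATE_RUNNING" {
			return j, nil
		}
	}
	return job{}, fmt.Errorf("%w: %q", errNoRunningJob, name)
}

func main() {
	jobs := []job{
		{ID: "1", Name: "wordcount", State: "JOB_STATE_DONE"},
		{ID: "2", Name: "wordcount", State: "JOB_STATE_RUNNING"},
	}
	if j, err := findRunningJobByName(jobs, "wordcount"); err == nil {
		fmt.Println("found job", j.ID)
	}
	if _, err := findRunningJobByName(jobs, "missing"); err != nil {
		fmt.Println("lookup failed:", err)
	}
}
```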
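func NewClient ¶
func NewClient(ctx context.Context, endpoint string) (*df.Service, error)
NewClient creates a new Dataflow client with default application credentials and CloudPlatformScope. The Dataflow endpoint can optionally be overridden through the endpoint parameter.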
func ResolveXLangArtifacts ¶
func ResolveXLangArtifacts(ctx context.Context, edges []*graph.MultiEdge, project, url string) ([]string, error)
ResolveXLangArtifacts resolves cross-language artifacts with a given GCS URL as a destination, and then stages all local artifacts to that URL. This function returns a list of staged artifact URLs.
func StageModel ¶
func StageModel(ctx context.Context, project, modelURL string, model []byte) error
StageModel uploads the pipeline model to GCS as a unique object.
func Submit ¶
func Submit(ctx context.Context, client *df.Service, project, region string, job *df.Job, updateJob bool) (*df.Job, error)
Submit submits a prepared job to Cloud Dataflow.
Types ¶
type JobOptions ¶
type JobOptions struct {
	// Name is the job name.
	Name string
	// Experiments are additional experiments.
	Experiments []string
	// DataflowServiceOptions are additional job modes and configurations for Dataflow.
	DataflowServiceOptions []string
	// Pipeline options
	Options runtime.RawOptions
	Streaming bool
	Project string
	Region string
	Zone string
	KmsKey string
	Network string
	Subnetwork string
	NoUsePublicIPs bool
	NumWorkers int64
	DiskSizeGb int64
	DiskType string
	MachineType string
	Labels map[string]string
	ServiceAccountEmail string
	WorkerRegion string
	WorkerZone string
	ContainerImage string
	ArtifactURLs []string // Additional packages for workers.
	FlexRSGoal string
	EnableHotKeyLogging bool
	// Streaming update settings
	Update bool
	TransformNameMapping map[string]string
	// Autoscaling settings
	Algorithm string
	MaxNumWorkers int64
	WorkerHarnessThreads int64
	TempLocation string
	TemplateLocation string
	// Worker is the worker binary override.
	Worker string
	TeardownPolicy string
}
JobOptions captures the various options for submitting jobs to Dataflow.
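As a rough illustration of how these fields might be filled in, the sketch below uses a trimmed local copy of a few JobOptions fields so that it compiles without the Beam SDK. The jobOptions type, the checkOptions helper, and the rules it enforces are assumptions made for this example only; dataflowlib's own validation, if any, may differ. The bucket and project names are placeholders.

```go
package main

import (
	"errors"
	"fmt"
)

// jobOptions is a trimmed local copy of a few JobOptions fields,
// declared here only so that the example is self-contained.
type jobOptions struct {
	Name          string
	Project       string
	Region        string
	Streaming     bool
	NumWorkers    int64
	MaxNumWorkers int64
	MachineType   string
	Labels        map[string]string
	TempLocation  string
}

// checkOptions is a hypothetical pre-submission sanity check.
// The rules below are assumed for illustration.
func checkOptions(o *jobOptions) error {
	if o.Name == "" {
		return errors.New("job name must not be empty")
	}
	if o.Project == "" || o.Region == "" {
		return errors.New("project and region must be set")
	}
	if o.MaxNumWorkers > 0 && o.NumWorkers > o.MaxNumWorkers {
		return fmt.Errorf("NumWorkers (%d) exceeds MaxNumWorkers (%d)",
			o.NumWorkers, o.MaxNumWorkers)
	}
	return nil
}

func main() {
	opts := &jobOptions{
		Name:          "wordcount-example",
		Project:       "my-project", // placeholder project ID
		Region:        "us-central1",
		NumWorkers:    2,
		MaxNumWorkers: 10,
		MachineType:   "n1-standard-2",
		Labels:        map[string]string{"team": "data"},
		TempLocation:  "gs://my-bucket/tmp", // placeholder bucket
	}
	if err := checkOptions(opts); err != nil {
		fmt.Println("invalid options:", err)
		return
	}
	fmt.Println("options look consistent for job", opts.Name)
}
```

With the real package, the same values would go into a dataflowlib.JobOptions and be passed to Translate or Execute.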