Product News : Google Announces Open-Source Cloud Dataflow SDK for Java
on 2015/1/8 8:25:19

Google is announcing the availability of the Cloud Dataflow SDK as open source. This will make it easier for developers to integrate with Google's managed service, while also forming the basis for porting Cloud Dataflow to other languages and execution environments.


Google learned a lot about how to turn data into intelligence as the original FlumeJava programming models (the basis for Cloud Dataflow) continued to evolve internally at Google. Why share this via open source? So that the developer community can:
* Spur future innovation in combining stream- and batch-based processing models: Reusable programming patterns are a key enabler of developer efficiency. The Cloud Dataflow SDK introduces a unified model for batch and stream data processing. Our approach to time-based aggregations provides a rich set of windowing primitives, allowing the same computations to be used with batch or stream data sources. We will continue to innovate on new programming primitives and welcome the community to participate in this process.
* Adapt the Dataflow programming model to other languages: As data proliferates, so do programming languages and patterns. We are currently building a Python 3 version of the SDK to give developers even more choice and to make Dataflow accessible to more applications.
* Execute Dataflow on other service environments: Modern development, especially in the cloud, is about heterogeneous services and composition. Although we are building a massively scalable, highly reliable, strongly consistent managed service for Dataflow execution, we also embrace portability. As Storm, Spark, and the greater Hadoop family continue to mature, developers are challenged with bifurcated programming models. We hope to relieve developer fatigue and enable choice in deployment platforms by supporting execution and service portability.
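
To make the idea of reusing one windowed computation for batch and stream data more concrete, here is a minimal sketch in plain Java. It does not use the Cloud Dataflow SDK or reproduce its API; the names FixedWindowSketch, Event, WindowedSum, windowStart and sumBatch are all hypothetical and chosen only for illustration. It assumes fixed, non-overlapping time windows and a simple per-window sum.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: plain Java, not the Cloud Dataflow SDK.
public class FixedWindowSketch {

    /** A timestamped value (hypothetical stand-in for one data element). */
    static final class Event {
        final long timestampMillis;
        final long value;

        Event(long timestampMillis, long value) {
            this.timestampMillis = timestampMillis;
            this.value = value;
        }
    }

    /** Start of the fixed window that contains the given timestamp. */
    static long windowStart(long timestampMillis, long windowSizeMillis) {
        return Math.floorDiv(timestampMillis, windowSizeMillis) * windowSizeMillis;
    }

    /** Per-window running sums; elements can be added one at a time. */
    static final class WindowedSum {
        private final long windowSizeMillis;
        private final Map<Long, Long> sums = new TreeMap<>();

        WindowedSum(long windowSizeMillis) {
            this.windowSizeMillis = windowSizeMillis;
        }

        void add(Event e) {
            long start = windowStart(e.timestampMillis, windowSizeMillis);
            sums.merge(start, e.value, Long::sum);
        }

        /** Snapshot of the sums seen so far, keyed by window start. */
        Map<Long, Long> result() {
            return new TreeMap<>(sums);
        }
    }

    /** Batch use: feed a bounded list through the same aggregation. */
    static Map<Long, Long> sumBatch(List<Event> events, long windowSizeMillis) {
        WindowedSum agg = new WindowedSum(windowSizeMillis);
        for (Event e : events) {
            agg.add(e);
        }
        return agg.result();
    }

    public static void main(String[] args) {
        long oneMinute = 60_000L;

        // Batch: all elements are available up front.
        List<Event> events = Arrays.asList(
                new Event(1_000L, 3L),
                new Event(59_000L, 4L),
                new Event(61_000L, 5L));
        System.out.println(sumBatch(events, oneMinute)); // prints {0=7, 60000=5}

        // Stream-like: elements arrive one at a time, here out of order.
        WindowedSum streaming = new WindowedSum(oneMinute);
        streaming.add(new Event(61_000L, 5L));
        streaming.add(new Event(1_000L, 3L));
        streaming.add(new Event(59_000L, 4L));
        System.out.println(streaming.result()); // prints {0=7, 60000=5}
    }
}
```

Because each element is assigned to a window by its own timestamp rather than by its arrival order, the bounded list and the element-by-element feed produce the same per-window sums. A real system also has to decide when a window is complete and how to handle late data, which this sketch leaves out.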

We look forward to collaboratively building a system that enables distributed data processing for users from all backgrounds. We encourage developers to check out the Dataflow SDK for Java on GitHub and contribute to the community. Visit https://github.com/GoogleCloudPlatform/DataflowJavaSDK


Copyright (c) 2007-2014 Martinig & Associates | Methods & Tools Software Development Magazine | Privacy Policy