Qubole Adds Apache Spark to Its Big Data-as-a-Service Platform

Date 2015/2/19 9:56:35 | Topic: Product News

Qubole, the big data-as-a-service company founded by the team that developed Facebook's data infrastructure, has announced the addition of Apache Spark to the Qubole Data Service (QDS) platform. With the addition of Apache Spark, QDS broadens the types of workloads data scientists and data analysts can run on demand via the cloud, without the hassles, costs and risks of deploying Spark on-premises.

Qubole Data Service (QDS) is a self-service platform for big data analytics that runs on the three major public clouds: Amazon AWS, Google Compute Engine and Microsoft Azure. Among its key advantages, QDS automatically sets up and scales up a cluster to match the needs of the particular job, and then winds down nodes when they're no longer needed. QDS is a fully managed big data offering that leverages the latest open source technologies, such as Apache Hadoop, Hive, Presto, Pig, Oozie, Sqoop and now Spark, to provide the only comprehensive, "everything as a service" data analytics platform complete with enterprise security features, an easy-to-use UI and built-in data governance.

With the addition of Apache Spark, Qubole customers gain access to Spark's fast in-memory processing capabilities, which make it ideal for machine learning and predictive analytics applications. Data scientists can set up a Spark cluster in less than 15 minutes directly within the QDS web interface and, like all QDS services, the Spark feature auto-scales based on workload, ensuring the most efficient and cost-effective use of compute resources.

Qubole's Spark-as-a-Service is truly self-service and makes it easy to set up multiple user accounts and to launch and configure multiple Spark clusters as needed. It uses the familiar Spark notebook-style interface, which facilitates collaboration among data scientists, and accepts commands in the Scala, Python and R programming languages. QDS provides inline results and template visualizations for Spark queries, eliminating the need to open multiple windows or manage multiple applications. It also runs automatic health checks, alerts users to bad nodes and automates their replacement, improving productivity.

This article comes from Software Development Tools
