New MemSQL Enhances Real-Time Data Pipelines for Spark and Python

Date 2015/12/17 8:40:41 | Topic: Product News

MemSQL, a leader in real-time databases for transactions and analytics, announced significant advances for creating real-time data pipelines for Apache Spark, as well as support for the Python language and Non-Uniform Memory Access (NUMA) architectures in the latest version of MemSQL Ops. MemSQL can now run Spark SQL queries inside the MemSQL database, provide in-browser Python programming, and automatically optimize NUMA deployments. These features drive rapid results and faster analytics for data scientists.

As a transient processing framework, Spark is well suited for data analysis and model development, but it is not purpose-built for high-performance SQL. To that end, MemSQL now allows Spark SQL queries to run inside the MemSQL database, which can improve performance by up to 50x on many workloads. By combining MemSQL with Spark, data scientists can tap a permanent, transactional datastore to feed the latest business data into their models for real-time analytics.
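
The general idea of running a query inside the database, rather than pulling rows out and processing them elsewhere, can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in so that it runs anywhere; it is not MemSQL or Spark, and the table name, column names, and sample rows are hypothetical.

```python
import sqlite3

# sqlite3 is only a stand-in database for illustration; it is not MemSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (region TEXT, amount REAL)")  # hypothetical schema
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("east", 10.0), ("west", 5.0), ("east", 2.5), ("west", 1.5)],
)


def totals_in_client(conn):
    """Fetch every row, then aggregate in the processing layer."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM events"):
        totals[region] = totals.get(region, 0.0) + amount
    return totals


def totals_in_database(conn):
    """Push the aggregation into the database; only the results come back."""
    rows = conn.execute("SELECT region, SUM(amount) FROM events GROUP BY region")
    return dict(rows)


print(totals_in_client(conn))
print(totals_in_database(conn))
```

Both functions return the same totals, but the second moves only one row per region out of the database, which is where the benefit of in-database execution tends to come from on large tables.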

Moreover, the combination of Spark and MemSQL further unifies in-memory processing with in-memory storage for lightning-fast results. Users have access to a familiar SQL interface, which provides the performance and persistence to run real-time data pipelines successfully. Spark's data transformation capabilities can be used more fully when paired with distributed, in-memory stores like MemSQL than with traditional disk-based stores like HDFS.

The latest release of MemSQL Ops also features in-browser Python programming, which opens up Python's vast library of analysis packages, such as NumPy, SciPy, and pandas, to users running MemSQL. These libraries, as well as the prototyping speed of Python, have made Python incredibly popular among data scientists, application developers, and database administrators alike.

For users running MemSQL in a NUMA environment, MemSQL Ops now offers point-and-click installation. MemSQL Ops can intelligently map MemSQL instances to CPUs that share local memory. The increased efficiency on large server deployments can accelerate queries by up to 40%. From ultra-fast query execution to efficient storage of business data, MemSQL enables users to operate with maximum efficiency in fast-paced production environments. 
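
As a rough illustration of what mapping instances to CPUs that share local memory can look like, the sketch below assigns one instance to each NUMA node and builds a numactl command prefix that binds a process to that node's CPUs and memory. This is not how MemSQL Ops is implemented; the topology dictionary, the instance names, and both helper functions are hypothetical. Only the numactl flags --cpunodebind and --membind are real.

```python
def assign_instances(numa_nodes):
    """Return one instance spec per NUMA node, pinned to that node's CPUs.

    numa_nodes maps a node id to the list of CPU ids local to that node.
    """
    plan = []
    for node_id in sorted(numa_nodes):
        plan.append({
            "instance": f"leaf-{node_id}",  # hypothetical instance name
            "numa_node": node_id,
            "cpus": sorted(numa_nodes[node_id]),
        })
    return plan


def numactl_prefix(node_id):
    """Build a numactl prefix that keeps CPU and memory on the same node."""
    return ["numactl", f"--cpunodebind={node_id}", f"--membind={node_id}"]


# Hypothetical two-socket server with four CPUs per NUMA node.
topology = {0: [0, 1, 2, 3], 1: [4, 5, 6, 7]}
plan = assign_instances(topology)

for spec in plan:
    print(spec["instance"], spec["cpus"], " ".join(numactl_prefix(spec["numa_node"])))
```

Keeping each instance's threads and memory on one node avoids the slower cross-node memory accesses that NUMA hardware otherwise incurs.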

This article comes from Software Development Tools
