MemSQL Introduces Direct Ingest From Amazon S3 and HDFS With MemSQL Loader

Date 2015/1/6 9:24:09 | Topic: Product News

MemSQL has introduced a new productivity tool that dramatically improves data ingest from popular data stores like Amazon Web Services S3 and the Hadoop Distributed File System (HDFS). Whereas typical data loading often requires multiple steps, the MemSQL Loader streams data directly from the originating data store in a single transfer. MemSQL allows for multiple parallel input streams, further increasing performance and reducing time-consuming repetitive operations. The MemSQL Loader is also available as open source, giving developers the ability to adapt it to their favorite data source.
"The MemSQL Loader is another innovation of simplifying MemSQL implementations with production data workflows," said Nikita Shamgunov, Chief Technology Officer and co-founder, MemSQL. "After working with customers during their MemSQL deployments, we found a simple way to eliminate steps in data pipelines, saving them time and reducing complexity," he continued. "By streaming directly from popular data stores like Amazon S3 and HDFS, we offer customers an easy way to get started, and an efficient way to integrate the real-time transactions and analytics of MemSQL into existing environments."

Data ingest often involves multiple files or objects, particularly with scalable storage services like S3. In certain cases, customers may have hundreds or thousands of files to import. Other import methods operate at the job level, meaning that if just one file fails, the entire job must be restarted. MemSQL Loader supports loading batches of thousands of files or objects automatically, without having to specify files individually. This enables a synchronization path that loads only new or changed files as they are updated, and it allows a load to restart at a specific file if an import issue occurs.
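
The file-level behavior described above can be illustrated with a minimal sketch. This is not MemSQL Loader's code or API; the names `RemoteFile`, `files_to_load`, `run_batch`, and the `load_one` callback are hypothetical, and the fingerprint is assumed to be any value that changes when a file changes (for example a checksum or a size-and-timestamp pair).

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional


@dataclass(frozen=True)
class RemoteFile:
    """Hypothetical record for one file or object in the source data store."""
    key: str          # object key or file path
    fingerprint: str  # assumed to change whenever the file content changes


def files_to_load(
    listing: Iterable[RemoteFile],
    loaded: Dict[str, str],
    restart_at: Optional[str] = None,
) -> List[RemoteFile]:
    """Return the files that are new or changed since the last load.

    `loaded` maps each previously loaded key to the fingerprint it had then.
    If `restart_at` is given, files whose key sorts before it are skipped.
    """
    ordered = sorted(listing, key=lambda f: f.key)
    if restart_at is not None:
        ordered = [f for f in ordered if f.key >= restart_at]
    return [f for f in ordered if loaded.get(f.key) != f.fingerprint]


def run_batch(
    listing: Iterable[RemoteFile],
    loaded: Dict[str, str],
    load_one: Callable[[RemoteFile], None],
    restart_at: Optional[str] = None,
) -> List[str]:
    """Load each pending file independently and return the keys that failed.

    A failure in one file does not abort the batch; the failed file is simply
    not recorded in `loaded`, so a later run picks it up again.
    """
    failed: List[str] = []
    for f in files_to_load(listing, loaded, restart_at):
        try:
            load_one(f)
        except Exception:
            failed.append(f.key)
            continue
        loaded[f.key] = f.fingerprint
    return failed
```

Because progress is recorded per file rather than per job, rerunning `run_batch` with the same `loaded` mapping retries only the files that failed or changed, which is the contrast with job-level import drawn above.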

MemSQL Loader automates loading processes and enables queuing of jobs, further simplifying ingest. MemSQL administrators now have a thorough loading solution that eliminates steps from the process, scales performance across a distributed database, monitors loads at file-level granularity, and offers connectivity to common data stores like S3 and HDFS.

This article comes from Software Development Tools
