BlueData Invites AI/ML Developers to Play in Its BDaaS Sandbox

Developers can prototype with ML / DL in a multi-node sandbox environment using Docker containers.

Kids love to play in physical sandboxes. Developers love to “play” in virtual sandboxes.

BlueData, which offers a new-gen big-data-as-a-service (BDaaS) software platform, has made available a new environment for AI and machine-learning developers to try out new ideas and have fun testing them. This is a new turnkey package that enables accelerated deployment of artificial intelligence, machine learning and deep learning applications in the enterprise.

It turns out these applications can’t be built quickly enough. They are in heavy demand across every sector of the IT world, and as soon as one is up and running, the next is waiting in line.

The BlueData AI/ML Accelerator, introduced May 31, includes the software and professional services to deploy containerized multi-node sandbox environments for exploratory use cases with TensorFlow and other ML/DL tools, the Santa Clara, Calif.-based company said.

Convergence of Factors Enabling AI/ML Everywhere

The promise of AI has been around for several decades, but only recently has it started to be widely adopted in the enterprise, thanks to the convergence of several factors: faster networking, piles of new data, more powerful but cooler-running processors, virtually infinite storage, vast improvement in cloud services and faster mobile devices.

Now AI is being explored and implemented for digital transformation initiatives in nearly every industry, using innovative new open source tools and algorithms for ML / DL, the immense volumes of data available and advances in high-performance data processing infrastructure.

In fact, AI and ML / DL have moved into the mainstream with a broad range of data-driven enterprise applications: credit-card fraud detection, stock market prediction for financial trading, credit-risk modeling for insurance, genomics and precision medicine, disease detection and diagnosis, natural language processing (NLP) for customer service, autonomous driving and connected car IoT use cases and many other needs.

One of the most popular ML / DL tools is Google's TensorFlow, often used together with tools such as Python and new-gen GPUs to create an end-to-end pipeline from data preparation to modeling, scoring, and inference. However, there are many other open source and commercial tools that may be used, depending on the use case.
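The pipeline stages named above (data preparation, modeling, scoring and inference) can be sketched in a few lines. This is only an illustration under stated assumptions: it uses plain Python and a small logistic regression rather than TensorFlow so that it runs without extra dependencies, the data is synthetic, and every function name is made up for this example.

```python
import math
import random


def prepare(rows):
    """Data preparation: scale each feature column to the range [0, 1]."""
    cols = list(zip(*[features for features, _ in rows]))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    scaled = [
        ([(x - lo) / s for x, lo, s in zip(features, lows, spans)], label)
        for features, label in rows
    ]
    return scaled, (lows, spans)


def predict(weights, bias, features):
    """Scoring: logistic-regression probability for one example."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, epochs=200, lr=0.5):
    """Modeling: fit weights with plain stochastic gradient descent."""
    weights, bias = [0.0] * len(rows[0][0]), 0.0
    for _ in range(epochs):
        for features, label in rows:
            error = predict(weights, bias, features) - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


# Synthetic data, invented for illustration: label is 1 when a + b > 10.
random.seed(0)
raw = []
for _ in range(200):
    a, b = random.uniform(0, 10), random.uniform(0, 10)
    raw.append(([a, b], 1 if a + b > 10 else 0))

scaled, (lows, spans) = prepare(raw)
weights, bias = train(scaled)

# Scoring on the training set: fraction of examples classified correctly.
accuracy = sum(
    (predict(weights, bias, f) > 0.5) == bool(y) for f, y in scaled
) / len(scaled)


def infer(features):
    """Inference: apply the stored scaling, then the trained model."""
    scaled_features = [(x - lo) / s for x, lo, s in zip(features, lows, spans)]
    return predict(weights, bias, scaled_features)
```

In a real deployment each stage would typically be handled by a dedicated library and run across several nodes; the point here is only the shape of the flow from raw data to a prediction.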

Data Scientists, Developers Require Rapid Prototyping

Data scientists and developers want to evaluate and work with a variety of ML / DL tools, and they need rapid prototyping to compare different libraries and techniques. In most large organizations, they also need to comply with enterprise security, network, storage, user authentication, and access policies, BlueData said.

These users often start with a single-node environment, but these technologies are difficult to implement in multi-node distributed environments for large-scale enterprise use cases. It’s a complex software stack, requiring version compatibility and integration across many different components. Most enterprises lack the staff and skills to deploy and configure these tools with their existing data infrastructure and systems, whether on-premises or in the public cloud, on CPUs or GPUs, with a data lake or with cloud storage.

Thus, the BlueData AI / ML Accelerator provides a turnkey solution to address these challenges, including:

  • ready-to-run Docker images of popular ML / DL tools (including TensorFlow, Spark MLlib, H2O, Caffe2, Anaconda, and BigDL) for use in large-scale distributed computing environments;
  • the ability to spin up new ML / DL environments in a matter of minutes via self-service, with REST APIs or a few mouse clicks in a web UI;
  • secure integration with distributed file systems including HDFS, NFS, and S3 for storing data and ML / DL models;
  • automated and reproducible provisioning, enabling on-demand creation of identical ML / DL environments and reproducible results; and
  • professional services, training, and support to accelerate AI initiatives and deliver business outcomes with ML / DL.
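To make the self-service and reproducibility points in the list above concrete, here is a minimal sketch of how a declarative environment description could be turned into a REST request. It is not BlueData's API: the endpoint path, host names and field names are all hypothetical, and the request is built but never sent.

```python
import hashlib
import json
import urllib.request


def environment_spec(tool, nodes, gpus_per_node, storage_uri):
    """Describe an environment declaratively (all field names are hypothetical)."""
    return {
        "image": tool,
        "nodes": nodes,
        "gpus_per_node": gpus_per_node,
        "storage": storage_uri,
    }


def fingerprint(spec):
    """Hash a canonical JSON form so identical specs yield identical IDs."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def provisioning_request(base_url, spec):
    """Build (but do not send) a POST request carrying the spec as JSON."""
    return urllib.request.Request(
        url=base_url + "/environments",  # hypothetical endpoint path
        data=json.dumps(spec, sort_keys=True).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Two developers asking for the same setup produce the same fingerprint,
# which is one simple way to check that environments are identical.
spec_a = environment_spec("tensorflow", 4, 1, "hdfs://namenode.example/models")
spec_b = environment_spec("tensorflow", 4, 1, "hdfs://namenode.example/models")
request = provisioning_request("https://sandbox.example", spec_a)
```

Keeping the description as plain data is the design choice that matters: the same spec can be stored, compared or replayed later to recreate an environment.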

Enterprises can get up and running quickly with distributed ML / DL applications in multi-node containerized environments on any infrastructure, whether on-premises or in the cloud, using CPUs, GPUs or both. Fully configured environments can be provisioned in minutes, using self-service and automation.

Data scientists and developers can build prototypes, experiment and iterate with their preferred ML / DL tools for faster time-to-value. Their IT teams can ensure enterprise-grade security, data protection, and performance – with elasticity, flexibility, and scalability in a multi-tenant architecture.

The new platform is designed for out-of-the-box deployments with open source technologies including TensorFlow, Spark MLlib, H2O, Caffe2, Anaconda, and BigDL. However, it can be easily configured and extended for use with other ML / DL technologies, including open source tools and commercial applications. While initial implementations may focus on prototypes and pre-production environments, the solution is extensible to large-scale AI / ML / DL production deployments.

Intel a Major Investor in BlueData

Intel has a big stake in BlueData's success: Intel Capital, the chip maker's investment arm, led a $20 million funding round in the company in 2015.

Officials with both Intel and BlueData said that the vendors' partnership is designed to accelerate the adoption of big data technology by making it easier to deploy. As part of the collaboration, BlueData has optimized its EPIC big data infrastructure software for the Intel Architecture, while Intel brings engineering and marketing resources to help create a joint engineering roadmap and develop joint customer acquisitions.

The BlueData AI / ML Accelerator includes a one-year subscription for BlueData EPIC software along with professional services, training, and support to assist in customers’ AI and ML / DL deployments.

Chris Preimesberger

Chris J. Preimesberger is Editor-in-Chief of eWEEK and responsible for all the publication's coverage. In his 13 years and more than 4,000 articles at eWEEK, he has distinguished himself in reporting...