Ray
Open-source unified framework for scaling Python and AI workloads.
Publisher review
Ray is an open-source distributed computing framework designed to scale Python and machine learning workloads from a laptop to thousands of GPUs. Built by UC Berkeley's RISELab and backed by Anyscale, Ray provides a Python-native runtime that executes tasks with microsecond latency and can handle millions of tasks per second, a 10-30× speedup over Apache Spark for AI workloads. The framework consists of Ray Core (primitives: tasks, actors, objects) and five specialized libraries: Ray Train for distributed model training and fine-tuning, Ray Tune for hyperparameter optimization, Ray Serve for model deployment and online inference, Ray Data for multi-modal data processing, and Ray RLlib for reinforcement learning workflows.
Ray excels at heterogeneous compute allocation, independently scaling CPUs and GPUs as workloads demand. It integrates natively with PyTorch, TensorFlow, Hugging Face, and other ML libraries, allowing them to work together in parallelized environments. OpenAI uses Ray to coordinate training of large language models, and the PyTorch Foundation now hosts the project.
For organizations moving from batch-heavy Spark pipelines to real-time AI inference, Ray typically delivers 30× cost reductions on GPU workloads. However, Ray trades maturity and institutional support for performance: its ecosystem remains smaller than Spark's, structured data processing is not its focus, and deployment without proper authentication exposes clusters to cryptomining attacks (200,000+ Ray servers exposed online as of 2025).
How it works
- Ray Core: Distributed runtime with task, actor, and object primitives that transparently parallelizes Python code across clusters with microsecond task latency.
- Ray Train: Distributed training library supporting multi-node model training, fine-tuning, and fault tolerance for PyTorch, TensorFlow, and Hugging Face models.
- Ray Serve: Production inference framework that deploys models with independent scaling, enabling online prediction APIs and batch inference with GPU acceleration.
- Ray Tune: Hyperparameter optimization engine integrating with population-based training, early stopping, and asynchronous scheduling across GPU clusters.
- Ray Data: Framework-agnostic data loading and transformation supporting images, videos, audio, and structured data across training, tuning, and inference pipelines.
- Heterogeneous compute scheduling: Automatically allocates CPU and GPU resources per task, allowing GPUs and CPUs to scale independently within the same cluster.
- Decentralized fault tolerance: No single point of failure; uses object store replication and lineage recovery to handle worker failures transparently.
Strengths and trade-offs
Strengths
- Microsecond task latency and millions of tasks per second—10-30× faster than Spark for AI/ML workloads and parallel Python execution.
- Python-first design with native support for PyTorch, TensorFlow, Hugging Face, and other ML libraries working together in a single scalable environment.
- Heterogeneous compute allocation: GPUs and CPUs scale independently, enabling efficient use of mixed hardware and cost-effective inference.
Trade-offs
- Critical security vulnerability: unauthenticated API access allows remote code execution; 200,000+ exposed Ray servers online as of 2025, actively exploited for cryptomining.
- Smaller ecosystem and community compared to Spark; fewer third-party integrations, commercial support, and institutional knowledge in enterprises.
- Not optimized for large-scale distributed data processing; lacks SQL abstractions and ETL-centric features that make Spark the standard for data engineering.
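One simple way to act on the exposure risk listed above is to check whether a Ray dashboard port answers from outside the host. The sketch below uses only Python's standard library; port_is_reachable is a hypothetical helper written for this example, not part of Ray, and 8265 is the dashboard port this review mentions for local use. A successful connection shows only that something is listening, not whether authentication is enforced.

```python
import socket


def port_is_reachable(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumption: 8265 is the dashboard port of the cluster being checked.
    # Checking 127.0.0.1 only shows whether a local listener exists.
    print("dashboard port open locally:", port_is_reachable("127.0.0.1", 8265))
```

To test for real exposure, run the same check from a different machine against the head node's external address; if it returns True there, the port should be firewalled or bound to a private interface.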
Pricing context
Ray Core is free and open-source. Anyscale, the managed hosting platform, uses pay-as-you-go pricing with no monthly minimums. Compute costs depend entirely on instance type: CPU-only instances cost $0.0135/hour, NVIDIA T4 GPUs cost $0.57/hour, and high-end NVIDIA H100 GPUs cost $9.29/hour (H200 at $10.68/hour).
New accounts receive $100 in starter credits. Anyscale also offers BYOC (Bring Your Own Cloud) for deployment in any cloud or on-premises with enterprise support and volume discounts for committed usage. Total cost varies dramatically by GPU tier: at the rates above, an H100 instance costs roughly 690× as much as a CPU-only instance for the same hour.
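The gap between tiers can be worked out directly from the hourly rates quoted above. The sketch below is plain arithmetic on those four figures; the dictionary keys and the helper names cost_ratio and job_cost are made up for this example, and actual Anyscale rates may differ from the ones listed here.

```python
# Hourly rates in USD as quoted in this review (illustrative, may be out of date).
HOURLY_USD = {
    "cpu_only": 0.0135,
    "t4": 0.57,
    "h100": 9.29,
    "h200": 10.68,
}


def cost_ratio(tier, baseline="cpu_only"):
    """How many times more one hour of `tier` costs than one hour of `baseline`."""
    return HOURLY_USD[tier] / HOURLY_USD[baseline]


def job_cost(tier, hours, instances=1):
    """Total compute cost in USD for a job, ignoring credits and discounts."""
    return HOURLY_USD[tier] * hours * instances


for tier in ("t4", "h100", "h200"):
    print(f"{tier}: {cost_ratio(tier):.0f}x the CPU-only hourly rate")

print(f"8 hours on one H100: ${job_cost('h100', 8):.2f}")
```

Because the ratio is this large, the $100 starter credit covers thousands of CPU-only hours but only about ten H100 hours at these rates.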
Getting started with Ray
- Install Ray via pip: Run pip install ray to install Ray Core from PyPI. Ray is free and open-source. Installation takes under five minutes on Python 3.8 or later. No sign-up or credentials are required for local development and testing.
- Initialize a cluster on your laptop: Import Ray and call ray.init() in your Python script to start a local cluster. This spawns worker processes and readies the runtime. To set resource counts explicitly, pass num_cpus and num_gpus to ray.init(). For single-machine evaluation, omit the arguments and Ray auto-detects your hardware.
- Define tasks with the remote decorator: Decorate Python functions with @ray.remote to convert them into distributed tasks. Specify resource demands in the decorator, for example @ray.remote(num_cpus=2, num_gpus=1). Call these functions with .remote() instead of invoking them directly; each call returns an object reference immediately. Ray queues the tasks for parallel execution across cluster nodes without you managing scheduling.
- Run distributed tasks and collect results: Submit tasks to the cluster and fetch results using ray.get(). Measure wall-clock time against serial execution to confirm the speedup. Start with a small dataset or synthetic workload to verify that Ray distributes compute correctly. Monitor the Ray dashboard (localhost:8265, available when Ray is installed with the ray[default] extra) to see worker utilization.
- Deploy to Anyscale for production: Create a free Anyscale account and deploy your Ray code to managed clusters. Configure GPU allocation (T4, A100, and H100 are available), autoscaling policies, and job queues. Set up monitoring and cost tracking per instance type. Use BYOC for on-premises deployments with enterprise support.
Frequently Asked Questions
What is Ray distributed computing?
Ray is an open-source distributed computing framework that scales Python and machine learning workloads from laptops to thousands of GPUs. Built by UC Berkeley's RISELab and backed by Anyscale, it executes tasks with microsecond latency and handles millions of tasks per second, a 10-30× speedup over Apache Spark for AI workloads.
How does Ray compare to Spark?
Ray delivers 10-30× faster execution for AI/ML workloads with microsecond task latency and independent GPU/CPU scaling. However, Spark remains superior for large-scale data processing and ETL-centric tasks. Ray trades ecosystem maturity for performance: it lacks SQL abstractions and has fewer third-party integrations than Spark's established enterprise infrastructure.
What are Ray's main libraries?
Ray includes five specialized libraries: Ray Train for distributed model training and fine-tuning, Ray Tune for hyperparameter optimization, Ray Serve for model deployment and online inference, Ray Data for multi-modal data processing, and Ray RLlib for reinforcement learning. These libraries integrate natively with PyTorch, TensorFlow, and Hugging Face.
How much does Ray cost?
Ray Core is free and open-source. Anyscale's managed platform uses pay-as-you-go pricing with no monthly minimums. CPU-only instances cost $0.0135/hour, NVIDIA T4 GPUs cost $0.57/hour, and H100 GPUs cost $9.29/hour. New accounts receive $100 in starter credits. At these rates, an H100 hour costs roughly 690× as much as a CPU-only hour.
What security risks does Ray have?
Ray has a critical vulnerability: unauthenticated API access allows remote code execution. As of 2025, over 200,000 Ray servers were exposed online, actively exploited for cryptomining attacks. Deployments without proper authentication are highly vulnerable. Organizations must enable security controls before production use and keep systems patched against known exploits.
Who uses Ray and why?
OpenAI uses Ray to coordinate large language model training. Organizations moving from batch-heavy Spark pipelines to real-time AI inference typically see 30× cost reductions on GPU workloads. Ray excels at heterogeneous compute allocation, independently scaling CPUs and GPUs within mixed-hardware clusters for efficient, cost-effective inference operations.
How Ray compares
Direct head-to-head against 3 competitors. Picked by 7wData.
Ray
- Pricing
- Ray Core is free and open-source. Anyscale, the managed hosting platform, uses pay-as-you-go pricing with no monthly minimums. Compute costs depend entirely on instance type: CPU-only instances cost $0.0135/hour, NVIDIA T4 GPUs cost $0.57/hour, and high-end NVIDIA H100 GPUs cost $9.29/hour (H200 at $10.68/hour). New accounts receive $100 in starter credits. Anyscale also offers BYOC (Bring Your Own Cloud) for deployment in any cloud or on-premises with enterprise support and volume discounts for committed usage. Total cost varies dramatically by GPU tier: at these rates, an H100 instance costs roughly 690× as much as a CPU-only instance for the same hour.
- Target
- Ray is an open-source distributed computing framework designed to scale Python and machine learning workloads from a laptop to thousands of GPUs.
- Deployment
- Open-source and self-hosted, on a laptop or a self-managed cluster, plus managed cloud via Anyscale.
- Strength
- Microsecond task latency and millions of tasks per second—10-30× faster than Spark for AI/ML workloads and parallel Python execution.
- Watch for
- Critical security vulnerability: unauthenticated API access allows remote code execution; 200,000+ exposed Ray servers online as of 2025, actively exploited for cryptomining.
Modal
- Pricing
- Free tier: $30/month compute credits. Team: $250/month. Enterprise: custom pricing.
- Target
- AI-native startups and Python ML engineers running spiky, GPU-intensive inference or fine-tuning workloads.
- Deployment
- Serverless SaaS, Python-decorator-based, zero container management.
- Strength
- Per-second GPU billing with sub-4-second cold starts, covering inference, fine-tuning, and multi-node clusters without Kubernetes.
- Watch for
- Non-preemptible and US regional surcharges stack to 3.75× the advertised base rates on production workloads.
Dask
- Pricing
- Open-source, free. Managed layer Coiled: free 500 CPU-hours/month, then $0.05/CPU-hour.
- Target
- Data scientists scaling existing Pandas and NumPy pipelines without rewriting code, typically 1-10 TB datasets.
- Deployment
- Open-source library, local multi-core or self-managed cluster, plus managed cloud via Coiled.
- Strength
- Near-identical Pandas and NumPy API lets existing Python analysts parallelize data-prep pipelines with minimal code changes.
- Watch for
- Distributed scheduler has no high-availability failover: if it crashes, all in-flight tasks are lost and the cluster resets.
Apache Spark
- Pricing
- Open-source, free. Databricks DBUs from $0.15 to $0.65+/DBU plus separate cloud VM costs.
- Target
- Data engineers at large enterprises running ETL, batch processing, and SQL analytics at petabyte scale.
- Deployment
- Open-source, on-prem or any cloud, plus managed services via Databricks, AWS EMR, and Google Dataproc.
- Strength
- Single engine covering SQL analytics, structured streaming, and batch ETL, with Delta Lake ACID lakehouse integration via Databricks.
- Watch for
- JVM serialization and Py4J boundary overhead significantly slow PySpark workloads compared to Ray's shared-memory object store for tensor passing.
Sources
Reporting on this tool draws on these publicly available sources.
- www.ray.io — Ray's core positioning as distributed AI compute engine, key features, and use cases
- docs.ray.io — Ray architecture, five native libraries (Data, Train, Tune, Serve, RLlib), and distributed computing capabilities
- www.anyscale.com — Anyscale managed Ray pricing: hourly rates by GPU type (T4 $0.57/hr, A100 $4.96/hr, H100 $9.29/hr), BYOC option, and $100 starter credits
- domino.ai — Ray vs. Spark trade-offs: Ray's microsecond latency and actor-based asynchronous execution versus Spark's maturity and ETL focus
- medium.com — Ray architecture (decentralized metadata, microsecond latency), Python-first design, and positioning for AI/ML versus Spark's JVM orientation
- thehackernews.com — 200,000+ exposed Ray servers online, unpatched RCE vulnerability, ShadowRay 2.0 cryptomining exploitation