From SSH to REST: A Security-Driven Modernization of Slack’s EMR Data Pipelines
To eliminate SSH security risks, Slack modernized its EMR data pipelines to a REST-based architecture, replacing 700+ SSH-based operators.
The key breakthrough was YARN Distributed Shell, which runs arbitrary shell commands in YARN containers with resource allocation and lifecycle management, and which can be driven through existing REST APIs. This made it possible to migrate every SSH-based job, including Hadoop workloads and custom shell commands, with zero downtime across 8 data regions.
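As a rough illustration of what launching Distributed Shell over REST can look like, the sketch below builds a request body for the YARN ResourceManager's Submit Application API (`POST /ws/v1/cluster/apps`, after obtaining an ID from `POST /ws/v1/cluster/apps/new-application`). The post does not say which endpoints or fields Slack's operators use, so treat this as an assumption. The function name `build_submit_payload`, its defaults, and the sample application ID are hypothetical. A real submission would also need `local-resources` and `environment` entries pointing at the Distributed Shell jar and the script to run, which are omitted here.

```python
import json

# Application master class shipped with Hadoop's Distributed Shell example app.
DSHELL_AM_CLASS = (
    "org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster"
)


def build_submit_payload(app_id, name, memory_mb=1024, vcores=1, num_containers=1):
    """Build a JSON-serializable body for POST /ws/v1/cluster/apps.

    Hypothetical helper: it only assembles the payload and performs no
    network I/O. local-resources and environment are intentionally left out.
    """
    # {{JAVA_HOME}} and <LOG_DIR> are placeholders that YARN expands on the node.
    am_command = (
        "{{JAVA_HOME}}/bin/java -Xmx256m " + DSHELL_AM_CLASS
        + f" --container_memory {memory_mb} --container_vcores {vcores}"
        + f" --num_containers {num_containers} --priority 0"
        + " 1><LOG_DIR>/AppMaster.stdout 2><LOG_DIR>/AppMaster.stderr"
    )
    return {
        "application-id": app_id,
        "application-name": name,
        "application-type": "YARN",
        "max-app-attempts": 2,
        "am-container-spec": {"commands": {"command": am_command}},
        # Resources requested for the application master container itself.
        "resource": {"memory": memory_mb, "vCores": vcores},
    }


if __name__ == "__main__":
    # Sample ID is made up; a real one comes from the new-application call.
    payload = build_submit_payload("application_1700000000000_0001", "dshell-demo")
    print(json.dumps(payload, indent=2))
```

Keeping payload construction separate from the HTTP call makes the request easy to inspect and unit test before it is sent to a cluster.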