The Growing Significance Of DevOps For Data Science


DevOps involves infrastructure provisioning, configuration management, continuous integration and deployment, testing and monitoring.  DevOps teams have been closely working with the development teams to manage the lifecycle of applications effectively.

Data science brings additional responsibilities to DevOps. Data engineering, a niche domain that deals with complex pipelines that transform the data, demands close collaboration of data science teams with DevOps. Operators are expected to provision highly available clusters of Apache Hadoop, Apache Kafka, Apache Spark and Apache Airflow that tackle data extraction and transformation. Data engineers acquire data from a variety of sources before leveraging Big Data clusters and complex pipelines for transforming it.

Read more at Forbes