Amazon EMR – A Migration Plan
Amazon Web Services (AWS) offers Amazon Elastic MapReduce (EMR) for big data processing and analysis. The MapReduce software framework allows vast amounts of data to be processed quickly and cost-effectively. In addition, EMR securely and reliably handles a broad set of big data use cases, including log analysis, web indexing, data transformations (ETL), machine learning, financial analysis, scientific simulation, and bioinformatics. It does this by combining open source tools such as Apache Spark, Apache Hive, Apache HBase, Apache Flink, and Presto with the dynamic scalability of Amazon EC2 and the scalable storage of Amazon S3. Whether you are running a single-purpose, short-lived cluster or a long-running, highly available cluster, Amazon EMR can give your organization the flexibility it has been looking for. Let’s look at how to plan a migration to Amazon EMR.
Getting Started – Amazon EMR Migration Approaches
When your organization starts migrating its big data platform to the cloud, you must first decide how to approach the migration. There are three approaches.
1. Re-architect your platform to maximize the benefits of the cloud. This approach requires research, planning, experimentation, education, implementation, and deployment. These efforts cost resources and time but generally provide the greatest return: reduced hardware and storage costs, less operational maintenance, and the most flexibility to meet future business needs.
2. Lift and shift takes your existing architecture and migrates it to the cloud as is. It is the ideal way to move workloads from on-premises to the cloud when time is critical and ambiguity is high. It also carries less risk and a shorter time to market.
3. A hybrid approach blends lift and shift with re-architecture. It lets you experiment and gain experience with cloud technologies and paradigms before moving fully to the cloud.
Although there are pros and cons to each, it is imperative to agree on the migration approach your organization is taking before you move to the next step, prototyping.
Amazon EMR Prototyping
When moving to a new and unfamiliar product or service, there is always a period of learning. Usually, the best way to learn is to prototype: learning by doing, rather than by research alone, helps identify unknowns early in the process so you can plan for them later. Make prototyping mandatory so that assumptions are challenged. Common assumptions when working with new products and services include the following:
1. A particular data format is the best data format for my use case.
2. A particular application is more performant than another application for processing a specific workflow.
3. A particular instance type is the most cost-effective way to run a specific workflow.
4. A particular application running on-premises should work identically in the cloud.
There are best practices for prototyping, and an AWS partner can help you work through them so that every assumption is validated to a high degree of certainty.
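As an illustration of testing the instance-type assumption above, here is a minimal Python sketch that compares the cost of one workflow run across prototype cluster configurations. Everything in it is hypothetical: the configuration labels, node counts, hourly rates, and runtimes are placeholders, not real EMR prices or benchmarks. Substitute the rates from your own AWS bill and the runtimes you measure during prototyping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrototypeRun:
    """One measured prototype run of the same workflow on one cluster setup."""
    label: str              # hypothetical configuration name
    node_count: int         # number of instances in the cluster
    hourly_rate_usd: float  # assumed all-in hourly price per instance (placeholder)
    runtime_minutes: float  # wall-clock runtime measured during the prototype


def cost_per_run(run: PrototypeRun) -> float:
    """Compute cost of one run: nodes x hourly rate x hours.

    Ignores storage, data transfer, and per-second billing details.
    """
    return run.node_count * run.hourly_rate_usd * (run.runtime_minutes / 60.0)


def cheapest(runs: list[PrototypeRun]) -> PrototypeRun:
    """Return the configuration with the lowest cost per run."""
    return min(runs, key=cost_per_run)


# Placeholder measurements; replace with your own prototype results.
runs = [
    PrototypeRun("config-a", node_count=10, hourly_rate_usd=0.50, runtime_minutes=60),
    PrototypeRun("config-b", node_count=10, hourly_rate_usd=0.80, runtime_minutes=30),
]

if __name__ == "__main__":
    for run in runs:
        print(f"{run.label}: ${cost_per_run(run):.2f} per run")
    print("Cheapest:", cheapest(runs).label)
```

With these placeholder numbers, the configuration with the higher hourly rate is cheaper per run because it finishes in half the time. That is the kind of result a price sheet alone would not reveal, and it is why the assumption is worth measuring.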
Choosing a Team
When starting a migration to the cloud, you must carefully choose the project team that will research, design, implement, and maintain the new cloud system. We recommend that your team include individuals in the following roles, with the understanding that one person can play multiple roles:
1. Project leader
2. Big data application engineer
3. Infrastructure engineer
4. Security engineer
5. Group of engineers
Getting started with your migration plan consists of determining your migration approach, prototyping, and choosing your team. Once these critical items are settled, your organization can move to the next steps of the migration plan: gathering requirements, estimating costs, migrating the data, and providing ongoing support.
Cloud Rush is a certified AWS partner. We specialize in cloud assessments, strategy and planning, cloud migration, managed cloud services, and disaster recovery. Our “service that never sleeps” philosophy takes a hands-on, human approach to IT. Let Cloud Rush work with you to start your Amazon EMR migration journey.