Data Engineering Mock Interview | Myntra | Part 1

Data Engineering Mock Interview - Part 1

This is a summary of a data engineering mock interview between Kuldeep Pal and Vipul, a senior software engineer at Myntra. The interview is over 2 hours long and is released in multiple parts.

The interview covers a wide range of data engineering topics, including designing and implementing scalable, high-performance batch-processing architectures and working with popular tools and platforms such as Kafka, Apache Spark, Airflow, and AWS.

Whether you're starting your career in data engineering or looking to enhance your skills, this interview is a valuable resource for gaining insight into real-time data processing and data engineering. To book a mock interview, use the link below.

🔅 To book a mock interview - https://topmate.io/ankur_ranjan/15155


You can also follow the interviewer and the interviewee on LinkedIn using the links below.

🔅 Kuldeep (Interviewer) -   / kuldeep27396  
🔅 Vipul (Interviewee) -   / vipul-singhal-21a831125  


This part covers the architecture of #Spark, the differences between RDD, DataFrame, and Dataset, Spark optimization, shuffling in Spark, and more. It also includes examples of wide and narrow transformations and the reasons for having groupByKey, reduceByKey, and sortByKey. It closes with a scenario-based Spark question, a discussion of the trade-off between Spark SQL and the DataFrame API, and a DSA question.

If you're interested in data engineering, big data, or a career switch, this interview is a must-watch. Don't miss out on the secrets of success in this exciting and dynamic field!

Do watch the next video for the remaining content.

𝗝𝗼𝗶𝗻 𝗺𝗲 𝗼𝗻 𝗦𝗼𝗰𝗶𝗮𝗹 𝗠𝗲𝗱𝗶𝗮:
🔅 LinkedIn -   / thebigdatashow  
🔅 Instagram -   / ranjan_anku  

Chapters:
00:00 - Introduction
01:45 - Architecture of Spark
03:35 - How is Spark an in-memory computing engine?
06:15 - What is the difference between RDD, DataFrame, and Dataset, and why do we use them?
07:28 - What are cache and persist and what is the difference between them?
09:30 - Spark Optimization
12:23 - Shuffling in Spark
14:18 - Examples of Wide and Narrow Transformations
15:00 - Why do we have groupByKey, reduceByKey, and sortByKey?
15:50 - Spark Scenario Based Question - 1
30:59 - Trade-off between Spark SQL and DataFrame
32:41 - DSA Question - 1

#interview #dataengineering #bigdata #apachespark #careerswitch #job #mockinterview
