10156135 Data Engineering
Course Information
Description
Builds on students' Python programming foundation with a focus on data engineering tasks. Teaches Python data structures and relevant packages, including NumPy and Pandas.
Total Credits
3

Course Competencies
  1. Apply NumPy for mathematical computations and one-dimensional array manipulations to solve foundational data problems

  2. Manipulate complex datasets using advanced Pandas techniques for data cleaning, transformation, and structural analysis

  3. Implement professional version control and collaborative workflows using GitHub to manage technical project code

  4. Resolve real-world data engineering challenges by integrating Artificial Intelligence tools into the cleansing and transformation workflow

  5. Develop scalable data processing applications using the Apache Spark framework to handle high-volume datasets

  6. Translate algorithmic logic and mathematical requirements into executable, efficient Python code

  7. Construct end-to-end data pipelines that ensure data integrity and quality from raw ingestion through the engineering lifecycle

  8. Evaluate technical requirements to select the optimal framework (NumPy, Pandas, or Spark) for a given data solution based on performance and scale

This Outline is under development.