Course Information
Description
Builds on students' existing Python programming skills, with a focus on data engineering tasks. Teaches Python data structures and relevant packages, including NumPy and Pandas.
Total Credits
3
Course Competencies
- Apply NumPy for mathematical computations and 1-dimensional array manipulations to solve foundational data problems
- Manipulate complex datasets using advanced Pandas techniques for data cleaning, transformation, and structural analysis
- Implement professional version control and collaborative workflows using GitHub to manage technical project code
- Resolve real-world data engineering challenges by integrating Artificial Intelligence tools into the cleansing and transformation workflow
- Develop scalable data processing applications using the Apache Spark framework to handle high-volume datasets
- Translate algorithmic logic and mathematical requirements into executable, efficient Python code
- Construct end-to-end data pipelines that ensure data integrity and quality from raw ingestion through the engineering lifecycle
- Evaluate technical requirements to select the optimal framework (NumPy, Pandas, or Spark) for a given data solution based on performance and scale
This Outline is under development.