The Introduction to Data Science with Python course at the Bangladesh-Korea Information Access Center (BK-IAC), Department of CSE, BUET, is designed for students, professionals, and aspiring data practitioners who want to build a strong foundation in data science. The course introduces the complete data science workflow using Python: data collection, data cleaning, preprocessing, exploratory data analysis, visualization, statistical reasoning, basic machine learning, and practical case-study-based analysis.
The course emphasizes hands-on learning through real datasets and practical exercises. Participants will learn how to transform raw data into meaningful insights and how to communicate findings effectively through visualizations, reports, and simple predictive models.
This course is suitable for beginners who are interested in data science. Basic computer literacy is required. Prior programming knowledge, especially in Python, will be helpful but is not mandatory. The course will include a brief introduction to Python programming for data science; however, participants without any programming background are encouraged to learn the basics of Python before or during the early part of the course.
Participants who want to strengthen their Python foundation may complete the "Introduction to Python" course offered by BK-IAC before enrolling in this course: Course Details Link.
The course runs for 8 weeks, with two 3-hour classes per week (16 classes in total). The tentative lecture plan is as follows:
| Class# | Content |
|---|---|
| 1 | Introduction to Data Science: Concepts, Applications, Data Science Workflow, Tools, and Python Environment Setup |
| 2 | Python Essentials for Data Science: Variables, Data Types, Lists, Dictionaries, Control Flow, Functions, and Jupyter Notebook |
| 3 | Numerical Computing with NumPy: Arrays, Indexing, Vectorized Operations, Aggregation, and Basic Numerical Analysis |
| 4 | Data Handling with Pandas: Series, DataFrames, Loading CSV/Excel Files, Selection, Filtering, Sorting, and Basic Data Inspection |
| 5 | Data Cleaning and Preprocessing: Missing Values, Duplicates, Outliers, Data Type Conversion, Encoding, Scaling, and Feature Preparation |
| 6 | Exploratory Data Analysis - I: Descriptive Statistics, GroupBy, Aggregation, Correlation Study, and Summary Tables |
| 7 | Data Visualization - I: Line Plot, Bar Plot, Histogram, Box Plot, Scatter Plot, and Visualization Principles using Matplotlib |
| 8 | Data Visualization - II: Advanced Visualization using Seaborn, Pairplot, Heatmap, Violin Plot, and Distribution Plot; Assignment 1 on Data Cleaning, EDA, and Visualization |
| 9 | Basic Statistics for Data Science: Mean, Median, Variance, Standard Deviation, Probability Concepts, Distributions, and Sampling |
| 10 | Statistical Reasoning and Hypothesis Testing: Confidence Interval, p-value, t-test, Chi-square Test, and Interpreting Statistical Results |
| 11 | Introduction to Machine Learning: Supervised vs. Unsupervised Learning, Train-Test Split, Features, Labels, and Model Development Workflow |
| 12 | Regression Models: Simple Linear Regression, Multiple Linear Regression, Model Fitting, Prediction, and Error Metrics |
| 13 | Classification Models: Logistic Regression, Decision Tree, k-Nearest Neighbors, Accuracy, Precision, Recall, F1-score, and Confusion Matrix |
| 14 | Unsupervised Learning and Data Mining: Clustering, k-Means, and Dimensionality Reduction Concepts; Evaluation of Assignment 1 |
| 15 | Real-World Case Study: End-to-End Data Science Project using a Practical Dataset, Insight Generation, Visualization, and Model Interpretation |
| 16 | Evaluation of Final Case Study/Project, Presentation of Findings, Discussion on Data Ethics, Career Pathways, and Future Learning Directions |
Email: iac@cse.buet.ac.bd
Phone: 9665650-80 Ext-6438
Mobile: 01670032959