gantt
title Preparing Polyglot Notebooks Talk for Stir Trek 2023
axisFormat %m-%d
section Proposal <br> and <br> Evaluation
Submit Abstract :done, 2023-01-15, 2023-02-18
Session Evaluation :done, EVAL, 2023-02-18, 2023-03-05
Talk Accepted :milestone, done, after EVAL,
section Talk <br> Preparation
Research & Outlining :done, OUTLINE, 2023-03-12, 9d
Create Mermaid Examples :done, MER_EXAMPLE, after OUTLINE, 5d
Write Mermaid Articles :active, MER_ART, after MER_EXAMPLE, 7d
Write Jupyter Articles : after MER_ART, 3d
section Delivery
Final Notebook :crit, NOTEBOOK, 2023-04-19, 7d
Rehearsal :crit, after NOTEBOOK, 2023-05-04
Stir Trek 2023 :milestone, crit, 2023-05-05, 1d
2025-2026
1 General instructions
For this course, the grading consists of a long-term project.
You will need to work throughout the semester to master it. Please start working on it as soon as possible. You need to use your GitHub account (see Git lecture), as the expected outputs for this course are GitHub repositories.
2 Team project
It will be mostly a visualization project, hopefully an interactive one, with the creation of an app or website, helping to navigate your work. The project consists of an investigation of an economic, social, ecological or public health subject (of your choice), through the analysis of (public) datasets. Concrete examples could be:
- sources of energy, see this webpage or this app
- population/demographic issues, see this advanced interactive app
- an ecological issue like this one
- SNCF traffic, …
Bonus: one can also make predictions (future population in an area, future electric consumption, …) by developing a Machine Learning/Statistical model and present these predictions in the app/website.
2.1 Some sources of datasets
Some dataset websites are listed here:
- openstreetmap (api)
- openrouteservice (api)
- gapminder
- USenergyadmin
- Kaggle
2.2 Setting and objectives
The group composition is available on Moodle.
The project repository must show a balanced contribution between group members; grades may vary within a group to reflect an unbalanced workload. Group issues must be reported early to the teaching team.
2.3 Timing
Mid-term project snapshot: This part consists of starting the group project and explaining the main question you have chosen to investigate. You should show preliminary work, the organization of the workload, the first git steps for the group project, etc. See Section 2.3.1 below for details. This part corresponds to 25% of the final grade. The due date is October 25 (23:59).
The GitHub repository with the presentation slides should be completed before Wednesday 10 December (23:59). This part corresponds to 40% of the final grade. Nothing pushed after the deadline will be taken into account.
The oral presentation (15 min + 5 min of questions) takes place on December 08 (8:00) (SC 36.04). This part corresponds to 35% of the final grade.
2.3.1 Mid-term project snapshot
Please provide the URL of the group repository and the group composition (do not forget your student number) in Moodle.
The main point for this step is to create a README.qmd in a roadmap directory at the root of your project. Your file should give the outline of the project with the following ingredients:
2.3.1.1 Elements expected for the mid-term project
| General | Details | Points (out of 20) |
|---|---|---|
| Mid-term | Git / branches | 4 |
| | Task assignment / Gantt chart | 4 |
| | Dataset choices / Download / Description | 4 |
| | Packages/software description for the project | 4 |
| | Figure of interest/narration | 4 |
| Total | | 20 |
2.3.2 Final project
2.3.2.1 General guidelines
The ultimate goal is to provide one of the following two outputs:
- a runnable Python app that illustrates your data and the narration.
- a website with `quarto` (or other libraries) that presents your project. This website should be deployed through GitHub Pages. At least one of the pages should contain an interactive element (maps, widgets, etc.).
A description of the procedure will be needed (imagine you are addressing a user unfamiliar with your package). An example of a project made in 2020 is available at https://github.com/tanglef/chaoseverywhere.
2.3.2.2 Project structure
- All the code will be placed in a subdirectory called `/my_module_name` (choosing your module name accordingly).
- A slide deck in Beamer/LaTeX, quarto or LibreOffice will be put in a `/slide` directory of the repository. The latter will be a short presentation of the work, to be presented orally for 15 min in front of a jury at the end of the project.
2.3.2.3 Git aspects
- A `.gitignore` that prevents garbage files from being included in your project.
- Balanced commits should be made in two branches (e.g., the development branch and the master one), and merged for the milestone day.
- Your repository should contain a `README.md` with:
  - the title and a short description of the project
  - the source of the data
  - for those who choose the module/app, a description of how to run/install it
  - for those who choose a website, the link to the website and a code snippet to build it
  - the authors list and a license description. See this website. A default choice could be the MIT License.
2.3.2.4 Object programming aspects
- You should code at least one `Python` class.
- Your `python` project should contain submodules. Each submodule will be devoted to a specific sub-task of your project.
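As a minimal sketch of these two requirements, the snippet below shows one small class together with a possible submodule layout in comments. The layout, the class name `EnergyRecords`, and its methods are illustrative assumptions, not names imposed by the course.

```python
# Hypothetical layout (all names are illustrative assumptions):
#   my_module_name/
#       __init__.py
#       io/          -> loading the data
#       preprocess/  -> cleaning the data
#       vis/         -> plotting and interactive elements
from dataclasses import dataclass, field


@dataclass
class EnergyRecords:
    """Container for yearly consumption values (hypothetical example)."""

    values_by_year: dict = field(default_factory=dict)

    def add(self, year, value):
        """Store the consumption value for a given year."""
        self.values_by_year[year] = value

    def mean(self):
        """Return the mean consumption over all stored years."""
        if not self.values_by_year:
            raise ValueError("no records stored")
        return sum(self.values_by_year.values()) / len(self.values_by_year)


records = EnergyRecords()
records.add(2020, 10.0)
records.add(2021, 14.0)
print(records.mean())  # 12.0
```

In a real project such a class would typically live in one submodule (for instance the one loading the data) and be imported by the others.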
2.3.2.5 Dataset(s)
- For reproducibility, the data should be easily available to anyone who wants to check your work. Ideally, the end user should not need to perform a manual download of any kind (use the `pooch` package or variants for instance).
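One possible way to automate the download is sketched below with `pooch.retrieve`. The URL and cache directory are placeholders chosen for illustration, and `pooch` is imported inside the function only so that the snippet can be loaded without the package installed; a real project would replace the placeholders and preferably pin a SHA256 hash.

```python
from pathlib import Path

# Placeholder values, for illustration only (not a real dataset).
DATA_URL = "https://example.org/data/energy.csv"
CACHE_DIR = Path.home() / ".cache" / "my_module_name"


def fetch_dataset(url=DATA_URL, cache_dir=CACHE_DIR, known_hash=None):
    """Download the dataset if needed and return the local file path."""
    import pooch  # third-party package: pip install pooch

    # pooch.retrieve skips the download when the file is already cached.
    # known_hash=None disables the integrity check; pass a real SHA256
    # string once the dataset is fixed.
    return pooch.retrieve(url=url, known_hash=known_hash, path=str(cache_dir))
```

Calling `fetch_dataset()` from the loading code means a user only has to install the package and run it; no manual download step is required.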
2.3.2.6 Time/memory evaluation
- A full study of the time and memory footprint of the code should be provided for the whole pipeline. Elements showing speed-ups or memory savings you found along the way would also be appreciated.
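A time and memory measurement could start from something like the following sketch, which relies only on the standard library (`time.perf_counter` and `tracemalloc`). The helper `profile` and the two toy functions are hypothetical; note that `tracemalloc` slows execution, so timings taken this way are only indicative.

```python
import time
import tracemalloc


def profile(func, *args, **kwargs):
    """Run func once; return (result, elapsed seconds, peak traced bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, elapsed, peak


def sum_squares_list(n):
    # Builds the full list in memory before summing.
    return sum([i * i for i in range(n)])


def sum_squares_gen(n):
    # Generator expression: no intermediate list is stored.
    return sum(i * i for i in range(n))


for f in (sum_squares_list, sum_squares_gen):
    result, elapsed, peak = profile(f, 100_000)
    print(f"{f.__name__}: {elapsed:.4f} s, peak {peak / 1024:.1f} KiB")
```

Comparing two variants of the same step, as done here, is one way to document a speed-up or memory saving found along the way.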
2.3.2.7 Documentation
- Docstrings should be populated for every `python` class and function.
- You should create an API documentation using `sphinx` for instance.
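As an illustration, here is a docstring written in the NumPy style, which `sphinx` can render through its `sphinx.ext.napoleon` extension. The function `growth_rate` is a made-up example, and the choice of the NumPy style is an assumption; any consistent docstring convention would serve.

```python
def growth_rate(previous, current):
    """Compute the relative growth between two values.

    Parameters
    ----------
    previous : float
        Value at the earlier date (must be non-zero).
    current : float
        Value at the later date.

    Returns
    -------
    float
        Relative change, e.g. 0.1 for a 10% increase.

    Raises
    ------
    ValueError
        If ``previous`` is zero.
    """
    if previous == 0:
        raise ValueError("previous must be non-zero")
    return (current - previous) / previous


# The first docstring line is what most tools show as the summary.
print(growth_rate.__doc__.splitlines()[0])
```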
2.3.2.8 Test and CI
- Provide unit tests to check that the functions you propose satisfy the requirements you target.
- Bonus points will be given if you implement a Continuous Integration solution with GitHub that runs your unit tests at each commit.
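A unit test can be as small as the sketch below. The function `normalize` is hypothetical, and the tests are written as plain `test_*` functions with bare assertions, a form that `pytest` discovers automatically when placed in a `test_*.py` file; a Continuous Integration job would then simply run the test suite on each push.

```python
def normalize(values):
    """Scale values linearly to the [0, 1] range (hypothetical function)."""
    low, high = min(values), max(values)
    if low == high:
        raise ValueError("values must not all be equal")
    return [(v - low) / (high - low) for v in values]


def test_normalize_bounds():
    # The smallest value maps to 0, the largest to 1.
    assert normalize([2.0, 4.0, 6.0]) == [0.0, 0.5, 1.0]


def test_normalize_constant_input():
    # Constant input has no range, so an error is expected.
    try:
        normalize([3.0, 3.0])
    except ValueError:
        return
    raise AssertionError("expected ValueError")


if __name__ == "__main__":
    test_normalize_bounds()
    test_normalize_constant_input()
    print("all tests passed")
```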
Here’s a recap for the final project (code, app/website):
| General | Details | Points (/20) |
|---|---|---|
| Code | Problem Resolution/Narration | 3 |
| | Readme/Installation | 2 |
| | Unit Tests | 3 |
| | Class (create at least 1 class) | 2 |
| | Code quality/Organization | 2 |
| | Graphical aspects: Widgets, clickable map, etc. | 4 |
| | Git branches | 2 |
| | Documentation | 2 |
| Total | | 20 |
Bonus points will be given for the following reasons:
- Continuous integration with GitHub
- Strongly object-oriented code (inheritance, abstraction, …)
- Automatic loading of datasets (with `pooch` for instance)
- Time/Memory efficiency assessment
- Originality
Here’s also a short recap for the presentation part (slides, oral presentation):
| General | Details | Points (/20) |
|---|---|---|
| Oral | Slides quality and structure | 10 |
| | Clarity / lively presentation / Rhythm / Show | 10 |
| Total | | 20 |
3 Examples
Here are some links to the GitHub Pages sites showing what was done last year: