January edition of Berkeley Datahub Newsletter
Get regular updates about the cutting-edge work happening with Berkeley Datahub and across the Berkeley Data Science Teaching Stack
Why this newsletter:
The Berkeley Data Science Teaching Stack is a collection of open-source tools that enable large-scale data science research and education efforts across UC Berkeley. The objective of this newsletter is to regularly inform interested readers about major updates across all services available as part of this stack. These include Datahub (infrastructure), Otter Grader (auto-grader for Python and R), Nbgitpuller (notebook distribution tool), and the Datascience package (introductory Python package), along with case studies of notebook-based teaching.
Each edition of this newsletter has the following three objectives:
Share major product-related updates across the Berkeley Data Science Teaching Stack.
Share case studies of teaching team(s)/student(s) using this stack.
Share logistical information about upcoming workshops and specific calls to action (CTAs) for some of these tools.
Berkeley Data Science Teaching Stack Fall 2021 update:
Datahub:
RStudio Upgrade: The R Datahub has been upgraded to the latest version of R and RStudio (v1.4). One useful feature resulting from this upgrade is the ability to switch between the rendered visual markdown mode and the Rmd source mode by clicking a toggle button in the RStudio document toolbar. Refer to this documentation to learn more about visual markdown functionality. New users can now work through a rendered notebook as they do in Jupyter.
Real-Time Collaboration (RTC) functionality allows multiple users to collaboratively view and edit notebooks in real time in JupyterLab and RetroLab. We plan to pilot RTC in the Stat 159 course taught by Fernando Perez during Spring 2022 and use the feedback from that pilot to build future iterations of this functionality. Refer to this documentation to learn more about this feature.
RetroLab is an interface within JupyterLab that presents a simplified experience for new users and is enabled in Datahub. RetroLab retains the core user interface (UI) elements from the classic notebook while incorporating some of the major features from the JupyterLab IDE. Refer to this announcement post from the Jupyter team to learn more about the core functionality of RetroLab. The Jupyter team plans to retire the “classic notebook” over time, and RetroLab will be its replacement.
Nbgitpuller:
Teammate Yuvi Panda developed browser extensions for the nbgitpuller service that simplify auto-generating distributable links for Jupyter notebooks uploaded to GitHub. Currently, the extensions are available in the Firefox and Chrome stores. This is a simpler way to generate links than the previous link generator.
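For readers curious about what such a distributable link contains, here is a minimal Python sketch that assembles one by hand. It assumes the query parameters used by nbgitpuller links (repo, branch, urlpath); the hub address, the repository, and the helper name make_nbgitpuller_link are placeholders chosen for illustration and do not describe how the extensions themselves are implemented.

```python
from urllib.parse import urlencode


def make_nbgitpuller_link(hub_url, repo_url, branch, notebook_path, app="tree"):
    """Build a link that pulls a repo into a user's hub and opens a notebook.

    hub_url, repo_url and notebook_path are supplied by the caller; the
    values used in the example below are hypothetical placeholders.
    """
    # The repo is cloned into a folder named after the repository.
    repo_name = repo_url.rstrip("/").split("/")[-1]
    if repo_name.endswith(".git"):
        repo_name = repo_name[: -len(".git")]

    # urlpath tells the hub what to open after the pull finishes.
    query = urlencode(
        {
            "repo": repo_url,
            "branch": branch,
            "urlpath": f"{app}/{repo_name}/{notebook_path}",
        }
    )
    return f"{hub_url.rstrip('/')}/hub/user-redirect/git-pull?{query}"


# Hypothetical example values, not a real course repository.
link = make_nbgitpuller_link(
    hub_url="https://datahub.berkeley.edu",
    repo_url="https://github.com/example-org/example-course",
    branch="main",
    notebook_path="lab01/lab01.ipynb",
)
print(link)
```

A student who clicks a link like this is sent to their own hub server, where the repository is pulled and the named notebook is opened. The browser extensions and the link generator save instructors from assembling and URL-encoding these parts manually.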
Faculty Spotlight:
Instructor: David Broockman
Course Name: Political Science 3 (Introduction to Empirical Analysis and Quantitative Methods)
Jupyter Notebooks in R developed and used in Political Science 3 Course!
Guiding Questions for Broockman:
How did you leverage the Berkeley Data Science Teaching Stack as part of the Political Science 3 course during Fall 2021?
My entire class was based on Datahub and Otter Grader. The students had four problem sets over the course of the semester, and two in-class assignments each week. I also did two lectures each week. These were all done as notebooks, all hosted on Datahub.
How did the stack help improve student engagement and/or learning outcomes?
Most of the problem sets and in-class assignments were auto-graded using OttR. This had a number of advantages for students and student engagement. First, students could get rapid feedback on whether or not they understood the concepts because they were able to get their grades back almost immediately. Second, the tools allowed me to pursue a "flipped classroom" model, which I think was great for student engagement and allowed students to engage in trial-and-error learning and ask lots of questions in class.
What was your experience using the OttR grader?
Overall, my experience using the OttR grader was great. It saved me and my GSIs a lot of time, and I think the students really appreciated how quickly they were able to get their grades back.
What are your biggest learnings from using the stack?
Some of my biggest learnings are:
Jupyter notebooks work at scale
The flipped classroom model works especially well with Jupyter notebooks. Students can debug issues and misunderstandings live in class
Students pick up coding faster than I (and they) thought!
Need additional focus on conceptual questions (currently have multiple choice questions in notebooks; need to create more)
What advice would you give to faculty who intend to adopt these tools?
My advice to other faculty would be that it does take a bit of learning the tools upfront, but that the team does a great job of teaching you how to use the tools and supporting you. Then, once you get it, it's really seamless and an amazing stack of tools. I can't imagine teaching my class any other way.
Scheduled Workshops:
Data Science Modules and Sociology workshop, 2-3 PM PST on Jan 14th, 2022. Registration form for this workshop.
Data Science Modules and Civil & Environmental Engineering (CEE) workshop, 2-3 PM PST on Jan 19th, 2022. Registration form for this workshop.
Jupyter Accessibility Workshops on January 15th and 22nd, 2022.