Human Computer Interaction research on improving peer feedback in online education

The following was conducted during my time as a Human-Computer Interaction research assistant at the Design Lab, headed by Don Norman, Jim Hollan, and Scott Klemmer. The experiment was led by Catherine Hicks and Vineet Pandey, while Rachel and I created artifacts, helped with experimental design, coded qualitative data, found and read papers, helped build interfaces for their experiments, and presented at the UCSD Computer Science & Engineering 2015 Undergraduate Poster Session.

We explored how to improve peer feedback in online classes, collaborating with the PeerStudio team at Stanford. Peer feedback is necessary, but students perceive it as unreliable and unhelpful. How can we dispel this perception while also improving peer feedback in online classes?

Deliverables: Poster | Lab Final Presentation

Role: Research Assistant


Peer Feedback is Unreliable but Necessary

So far, MOOCs (massive open online courses) have shown promise as a way to make education more accessible. [1] [2] [3] While MOOCs still have many issues, including low retention rates, another problem is the low quality of the peer feedback provided in those classes. MOOCs need peer feedback because one teacher cannot attend to so many students. However, peer reviews given without much instruction lead to grader inaccuracy and inconsistency, and to a general perception among students that peer feedback is unreliable. The quality of the feedback is low compared to a teacher's or expert's reviews [1]. With low-quality peer feedback, the quality of education in MOOCs drops [1].


*Note: The experiment described here follows the formal scientific method used in psychology experiments. Although this method is the foundation of usability testing, the two differ in that a formal experiment commits to a single hypothesis to test and is subjected to critique after publication. Experiments with an unsuccessful hypothesis are still considered important (to a degree) and need to be documented, which is generally unlike usability testing for products. If you would like to discuss your thoughts on this, please feel free to hit me up!

Like most psychology experiments, the problem space was initially explored by researching other experiments and papers.

Previous Research

Communication and expectations lost online

Communication, norms, and expectations are often learned through cues in classrooms: language, assessments, and social presence.

Proposed Method

Context and training provide more support

Experiments on the “framing effect” (behavior changing based on how information is presented, and adjusting the environment accordingly) have hinted at applications for peer review. Communicating and framing the experience around how an expert would approach the review, and providing more context about the situation, might train novices to think more like experts, an approach known as "context scaffolding".

Hoped Outcome

Higher quality peer reviews and feedback

The hypothesis is that training novices with context scaffolding would help them think like experts and give expert-like feedback, grounded in the fundamental principles integral to the work's success. This would hopefully reduce biases and shift attention away from nonessential details, increasing the quality of the review and making it more reliable and trustworthy.

measuring success

Measuring Quality

Other studies hint that the difference between experts and novices lies in the types of feedback they give: experts tend to focus on deep feedback rather than surface feedback.

deep features

parts of the assignment that rest on the fundamental principles that are integral to the work's success

"You could focus on the successes of your previous experience to show the value you brought at your previous job."

surface features

cosmetic, non-essential, and individual choices, such as grammar and word choices

"Make sure to use a more legible font."


Experimental vs. Control

With the experimental group, we gave training on how an expert would review a resume and provided more context about the job description, helping frame participants to think like experts.

With the control group, we simply gave basic training on the experiment itself. At the end, we asked both groups to give feedback on a resume.


experimental group

  • gave more diverse feedback (deep and surface)
  • gave more positive feedback

control group

  • focused on surface features
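One simple way to summarize results like these is to compare each group's share of deep feedback. The counts below are entirely hypothetical (the study reported its findings qualitatively); they only illustrate the comparison.

```python
# Hypothetical comment counts per condition (not the study's actual data).
experimental = {"deep": 34, "surface": 30}
control = {"deep": 8, "surface": 41}

def deep_share(group):
    """Fraction of a group's coded comments labeled 'deep'."""
    total = group["deep"] + group["surface"]
    return group["deep"] / total

print(round(deep_share(experimental), 2))  # 0.53
print(round(deep_share(control), 2))       # 0.16
```

A larger deep-feedback share in the experimental condition is what the context-scaffolding hypothesis predicts.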



More Questions -- What Else?

I think this experiment was well thought out and methodical. However, since quantifying quality is inherently a qualitative judgment, more questions arose. Of course training would make an impact, but what else could explain the results? Was it really the training, or simply the sheer amount of information we passed along? Could it even be the length of time the reviewer spent in the experiment? If participants were in the experiment longer, they might come to believe they should try harder on their peer feedback, increasing its quality. What if we gave bad training? Would the feedback still improve?

I would also separate the job context from the training into two separate experiments, to see which factor drove which effect. This topic definitely needs more research, which many others have also started, like this one.


Be Methodical; Define Success

This was the first psychology experiment I was involved in; I helped with some of the data coding due to time constraints. Rachel and I are extremely grateful to have been involved in even a small part of the process. Cat and Vineet put in a lot of hard work before the experiment even began, researching and reading before tackling the problem. You must be prepared and methodical about your research ahead of time. Being a researcher is hard work, and your work is constantly being questioned.

As I demonstrated in the questions above, people will find reasons to poke holes in your research and question whether it is truly valid. Your experiment is constantly tweaked to see if the impact is still there, to gain a better understanding, or to rule out other explanations for why the impact occurred. Researchers must be careful not to claim causation when there may only be correlation. Putting yourself out there with a published paper is risky, and that's why so much hard work goes into the actual experiment. I learned to define and find measurements of success: this is what truly shows whether you made the impact you desired, and it is central to the claim your experiment makes.

personal lab reflection

Discussing HCI Research

This experience was extremely valuable to me: I got to see how psychological research can be applied in HCI, and to be around so many intelligent and amazing people. It was a great opportunity to listen to people consider different ideas, share their thoughts, and discuss research on current topics, especially given how many complexities and dimensions HCI research has.

On a personal note, the Design Lab also taught me to take initiative in my own growth, learning, and goals. I can ask for a change in direction if what I'm currently involved in doesn't fit my overall goals, or if I wish to do something different. The brilliant minds there taught me to stay curious and to keep asking questions (and the right ones!). And ultimately, never to be afraid to fail or to get feedback on your work: although your ego may sting a bit at first, hearing from different people about your work is ultimately beneficial.

Thank you!

Thank you to everyone at the Design Lab for being open, always willing to help and collaborate, providing us with mentorship, and being a great example. Special thanks to Cat for being willing to take a chance on a graduating senior, who is forever grateful for your willingness to give your time and share your wisdom and knowledge, no matter how short the experience was.