About the Author
Cyrus Wong is a senior lecturer in the Department of Information Technology at the Hong Kong Institute of Vocational Education (Lee Wai Lee), where he focuses on teaching public cloud technologies. He is a Microsoft Learn for Educators Ambassador and a Microsoft Azure MVP from Hong Kong.
Introduction
As an educator, grading tests and assignments is a challenging and time-consuming task. The process involves several steps: collecting all completed scripts, reviewing the standard answer and marking scheme for each question, assigning marks, and repeating the process for every student. Once all questions have been marked, the total score for each student must be calculated and entered into a spreadsheet. Finally, the scored scripts are returned to the students.
The current process is tedious and involves several unnecessary steps, such as flipping through papers, calculating total marks, and manually entering them into a spreadsheet. Reviewing the standard answer and grading each question individually is also inefficient, as educators must re-read the marking scheme repeatedly until they have memorized it. Furthermore, this approach can sometimes result in unfair grading, because educators do not review all student answers to the same question at the same time. To address this, some educators score one question at a time across all scripts, but this requires flipping through each script multiple times, adding to the workload.
The process is not only physically exhausting but can also result in long-term back and neck pain for educators. Moreover, it does not necessarily contribute to student learning. The issue could be addressed with machine learning and AI methods. Firstly, automating the process as much as possible would eliminate the need for manual paper flipping. Secondly, AI-assisted processing and analysis would make reviewing students' answers more efficient, reducing the workload for educators. Lastly, concealing students' identities would ensure that answers are scored objectively.
This version is not reliant on cloud services and can be used by any educator through the free GitHub Codespaces platform.
Solution
The package consists of four Jupyter notebooks that serve different purposes.
Highlighting Answers
The first one, question_annotations.ipynb, is used to convert scanned student scripts to JPEG and draw a bounding box around each answer.
In addition to the answer regions, educators are required to include bounding boxes for the NAME, ID, and CLASS information.
{
  "0": [
    { "x": 236, "y": 175, "width": 486, "height": 78, "label": "NAME" },
    { "x": 183, "y": 255, "width": 324, "height": 78, "label": "ID" },
    { "x": 820, "y": 251, "width": 196, "height": 82, "label": "CLASS" },
    { "x": 157, "y": 331, "width": 342, "height": 127, "label": "Q1" }
  ],
  "1": [
    { "x": 203, "y": 474, "width": 321, "height": 142, "label": "Q36" },
    { "x": 683, "y": 480, "width": 398, "height": 135, "label": "Q37" },
    { "x": 1171, "y": 483, "width": 392, "height": 139, "label": "Q38" }
  ]
}
The result is saved to annotations/annotations.json, which you can also edit directly.
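As a rough sketch (not necessarily the notebook's exact code; the pdf2image library, file paths, and output directories are assumptions), scanned scripts can be converted to JPEG pages and the annotated regions cropped out using the coordinates from annotations.json:

import json
import os
from pdf2image import convert_from_path  # requires the poppler utilities

os.makedirs("pages", exist_ok=True)
os.makedirs("regions", exist_ok=True)

# Convert every page of a scanned script to a JPEG image.
pages = convert_from_path("scripts/student_script.pdf", dpi=200)
for page_number, page in enumerate(pages):
    page.save(f"pages/page_{page_number}.jpg", "JPEG")

# Crop each labelled region (NAME, ID, CLASS, Q1, ...) from its page.
with open("annotations/annotations.json") as f:
    annotations = json.load(f)

for page_number, boxes in annotations.items():
    page = pages[int(page_number)]
    for box in boxes:
        region = page.crop((box["x"], box["y"],
                            box["x"] + box["width"], box["y"] + box["height"]))
        region.save(f"regions/{page_number}_{box['label']}.jpg", "JPEG")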
Preprocessing and Scoring
The second notebook, scoring_preprocessing.ipynb, validates the student name list and standard answers from Excel, performs OCR on the student answers, computes embeddings for the student answers and standard answers, calculates the similarity scores, and generates the scoring application website.
The index page lists all questions.
It is necessary for educators to review all questions.
For ID, NAME, and CLASS, the value itself can be corrected rather than a score being entered.
Due to the possibility of OCR errors, the notebook will cross-check with the provided name list.
Educators are required to manually correct the ID by entering the accurate ID.
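One way to implement this cross-check (a sketch only; the file name, column names, and matching threshold are assumptions) is to fuzzy-match the OCR output against the known IDs with difflib:

import difflib
import pandas as pd

# Assumed name list format: an Excel file with "ID" and "NAME" columns.
name_list = pd.read_excel("name_list.xlsx", dtype={"ID": str})
known_ids = name_list["ID"].tolist()

def match_student_id(ocr_id: str) -> str:
    """Return the closest known ID, or the raw OCR text if nothing is close."""
    candidates = difflib.get_close_matches(ocr_id.strip(), known_ids, n=1, cutoff=0.6)
    return candidates[0] if candidates else ocr_id

print(match_student_id("2101O345"))  # e.g. an OCR'd letter "O" is matched back to the real ID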
OCR the student's answer
import easyocr
import tempfile
from PIL import Image, ImageEnhance
import os

easyocrLanguages = ["en"]
reader = easyocr.Reader(easyocrLanguages, gpu=True)


def ocr_image_from_file(image_path, left, top, width, height):
    # Temporary path for the cropped answer region.
    imageFile = tempfile.NamedTemporaryFile(suffix=".png").name
    with Image.open(image_path) as im:
        # The crop method from the Image module takes four coordinates as input:
        # right can be represented as (left + width) and lower as (top + height).
        (left, top, right, lower) = (
            left,
            top,
            left + width,
            top + height,
        )
        # Crop the answer region out of the page image.
        im_crop = im.crop((left, top, right, lower))
        # Sharpen the cropped image to improve OCR accuracy.
        imageEnhance = ImageEnhance.Sharpness(im_crop)
        im_crop = imageEnhance.enhance(3)
        im_crop.save(imageFile, format="png")
    result = reader.readtext(imageFile, detail=0)
    text = "".join(result)
    os.remove(imageFile)
    return text
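For example, the function can be applied directly to one of the annotated regions; the page image path below is an assumption:

# Hypothetical usage: OCR the Q1 region defined in annotations.json (page 0).
q1_text = ocr_image_from_file("pages/page_0.jpg", left=157, top=331, width=342, height=127)
print(q1_text)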
Retrieve embeddings for both the student's answer and standard answer to compute the similarity score.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")


def calculate_similarity(answers, question):
    # Questions without a standard answer get a score of 0 for every student.
    if question not in standard_answer:
        return [0] * len(answers)
    # Add the standard answer to the head of the list.
    answers.insert(0, standard_answer[question])
    # Compute embeddings for the standard answer and all student answers.
    embeddings = model.encode(answers, convert_to_tensor=True)
    # Compute cosine similarities for each sentence with each other sentence.
    cosine_scores = util.pytorch_cos_sim(embeddings, embeddings)
    # Keep the similarity of every answer to the standard answer (row 0).
    pairs = []
    for j in range(0, len(cosine_scores)):
        pairs.append(float(cosine_scores[0][j]))
    # The similarity of an empty answer must be 0.
    l = list(map(lambda x: (x[0], 0) if x[0] == "" else x, zip(answers, pairs)))
    similarities = list(list(zip(*l))[1])
    similarities.pop(0)
    return similarities
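As an illustration, scoring three student answers against a made-up standard answer for Q1 might look like this (in the notebook, standard_answer is built from the Excel file):

# Hypothetical data for illustration only.
standard_answer = {"Q1": "Photosynthesis converts light energy into chemical energy."}
student_answers = [
    "Plants turn light energy into chemical energy.",
    "It is a reaction that happens in animal cells.",
    "",  # an empty answer is always forced to a similarity of 0
]
print(calculate_similarity(student_answers, "Q1"))
# -> one similarity value per student answer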
The program uses the Hugging Face all-MiniLM-L6-v2 sentence-transformers model, which can be run on GitHub Codespaces at no cost.
The design of the scoring web form
Educators establish the maximum score, minimum similarity threshold, and grading granularity. The program will allocate grades to students based on the set criteria.
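A minimal sketch of how such an allocation rule could work (the function name, linear scaling, and rounding behaviour are assumptions, not the notebook's exact logic):

def allocate_score(similarity, max_score, min_similarity, granularity):
    """Map a similarity value to a suggested score on the form."""
    # Below the threshold (or for an empty answer), suggest 0.
    if similarity < min_similarity:
        return 0
    # Scale the remaining range linearly up to the maximum score.
    fraction = (similarity - min_similarity) / (1 - min_similarity)
    raw = fraction * max_score
    # Round to the chosen granularity, e.g. half-mark steps.
    return round(raw / granularity) * granularity

print(allocate_score(0.85, max_score=4, min_similarity=0.5, granularity=0.5))  # 3.0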
The design of the form aims to reduce eye movement and eliminate the need for a mouse. Educators can simply use the tab and number pad to navigate through the form.
In the event of an incorrect OCR output, the educator can manually adjust the score.
If students fail to provide an answer, they will receive a score of 0 automatically.
The scanner may shift the position of the scanned image, so educators have the option to modify the bounding box and zoom level, or click on the image to view the whole page.
The program sends every alteration to the server and stores it as a JSON file. To revert the settings and scores, simply rename the file.
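A sketch of what that server-side persistence could look like (the file name, JSON structure, and function name are assumptions):

import json
from pathlib import Path

score_file = Path("scores/scores.json")  # rename this file to start over

def save_score(student_id: str, question: str, score: float) -> None:
    """Merge a single score change into the JSON store on every form update."""
    scores = json.loads(score_file.read_text()) if score_file.exists() else {}
    scores.setdefault(student_id, {})[question] = score
    score_file.parent.mkdir(parents=True, exist_ok=True)
    score_file.write_text(json.dumps(scores, indent=2))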
After grading is finished, the program flags any questions that were not marked and double-checks the IDs to ensure that every student's submission has been accounted for.
Postprocessing
The third notebook, scoring_postprocessing.ipynb, backs up the scoring results, generates the score report, creates an individual scored script PDF for each student, and generates samples.
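For instance, the per-student scored script PDF can be assembled from the page images with Pillow (a sketch; the directory layout and function name are assumptions):

from pathlib import Path
from PIL import Image

def build_scored_pdf(student_id: str) -> None:
    """Combine a student's annotated page images into a single PDF."""
    page_paths = sorted(Path(f"scored_pages/{student_id}").glob("*.jpg"))
    pages = [Image.open(p).convert("RGB") for p in page_paths]
    Path("scored_pdfs").mkdir(exist_ok=True)
    pages[0].save(f"scored_pdfs/{student_id}.pdf",
                  save_all=True, append_images=pages[1:])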
To ensure quality control, educators must retain a sample. Within our institution, we are required to maintain samples categorized as Good, Average, and Weak.
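One simple way to pick those samples (a sketch, assuming a score report with STUDENT_ID and TOTAL columns; the file name is an assumption) is to take the highest, median, and lowest scoring scripts:

import pandas as pd

# Assumed report format: one row per student with a TOTAL column.
report = pd.read_excel("score_report.xlsx")
ranked = report.sort_values("TOTAL", ascending=False).reset_index(drop=True)

samples = {
    "Good": ranked.iloc[0]["STUDENT_ID"],
    "Average": ranked.iloc[len(ranked) // 2]["STUDENT_ID"],
    "Weak": ranked.iloc[-1]["STUDENT_ID"],
}
print(samples)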
Return the score result to students
Lastly, email_score.ipynb is used to email each student their individual scored script PDF.
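A minimal sketch of that step using the Python standard library (the SMTP server, addresses, credentials, and file layout are assumptions):

import smtplib
from email.message import EmailMessage
from pathlib import Path

def email_scored_script(student_email: str, student_id: str) -> None:
    """Send a student their scored script PDF as an attachment."""
    msg = EmailMessage()
    msg["Subject"] = "Your scored script"
    msg["From"] = "educator@example.edu"
    msg["To"] = student_email
    msg.set_content("Please find your scored script attached.")

    pdf_bytes = Path(f"scored_pdfs/{student_id}.pdf").read_bytes()
    msg.add_attachment(pdf_bytes, maintype="application",
                       subtype="pdf", filename=f"{student_id}.pdf")

    with smtplib.SMTP("smtp.example.edu", 587) as smtp:
        smtp.starttls()
        smtp.login("educator@example.edu", "password")  # use a proper credential store
        smtp.send_message(msg)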
Source Code
https://github.com/wongcyrus/AI-Handwrite-Grader/tree/main
Demo and explanation in Cantonese
Conclusion
The traditional process of grading tests and assignments can be a tedious and time-consuming task for educators. However, with the application of machine learning and AI methods, this process can be automated and made more efficient. The scoring package presented in this article offers a solution that uses OCR, embeddings, and similarity scoring to automatically score student answers. The program also includes a web form that allows educators to input scores and track grading progress. With this package, educators can reduce the workload of grading and ensure that scoring is objective and fair. Additionally, the program generates a report and individual scored script PDF files that can be emailed to students. This scoring package speeds up the traditional grading process by roughly eight times and offers a more efficient and objective way of scoring student assignments.
Our team is currently developing a new version of the scoring package that will utilize Microsoft Azure's advanced AI technologies, including Azure AI Document Intelligence, Azure OCR, and Azure OpenAI. This upgrade will significantly improve accuracy and further automate the grading process, making it even more efficient for educators. By leveraging the power of Azure's AI capabilities, the scoring package will provide a more accurate and reliable way of grading student assignments, ultimately benefiting both educators and students.
Project collaborators include Shing Seto, Stanley Leung, Ka Ka Leung, XU YUAN and Hang Ming (Leo) Kwok from the IT114115 Higher Diploma in Cloud and Data Centre Administration.