OxfordHack Winners of the Microsoft Cognitive Challenge

Lee Stott · Mar 21 2019

First published on MSDN on Jan 04, 2018

Interview Bot: Using Cognitive Services To Analyse Your Interview Performance

At the end of November, we headed to OxfordHack 2017, where participants from over 25 countries competed against each other. Companies such as Microsoft, Facebook, JP Morgan and many more sponsored the event. This article highlights our experience and explains some of the technical details of our application.

Our Team

Our team is made up of second-year students from The University of Manchester:

Samuel Littlefair: BSc Computer Science

Pranay Mistri: BSc Computer Science

Roneel Bhagwakar: BSc Human Computer Interaction

The Concept

Once everyone had grabbed the free goodies and chatted to the sponsors, the impressive presentation by Microsoft during the opening ceremony swayed us towards attempting their challenge: "Most Innovative use of the Microsoft Cognitive API's". Having an opportunity to work with their various machine learning APIs was an incentive, not to mention the prize of an Xbox One X each! After two to three hours of discussion, trying to solidify an idea, we came up with Interview Bot: a web-based interview preparation application that gives real-time feedback during your video interview. The application reports on your positivity levels and speech sentiment, and provides a written transcript of your interview. Since we are all applying for upcoming internships, our motivation was to help prepare others in the same position, as we were unaware of any existing software that did this. We threw ourselves in at the deep end, since none of us had done anything like this before, but we were all willing to learn.

In the end, our application used three Cognitive Services: the Emotion API, the Bing Speech API and the Text Analytics API. The diagram below gives an overview of how our service is structured:

We used a variety of technologies, some of which we'd never tried before. Most of the work is done in JavaScript, jQuery and PHP. We started by getting the APIs set up; Microsoft generously gave all participants Azure credit, so we could use the APIs for free.

Next we had to get the webcam set up, and the first challenge was how to link a user's webcam feed to the Cognitive APIs. Eventually we found the getUserMedia API, which gives access to the webcam feed so that we could capture still frames as Base64-encoded PNG images. We then needed a way to store each image on the server, since we were passing the Emotion API an image URL. It took a while to figure out, but in the end we POST the Base64 PNG to the server, parse it using PHP, and store it there. Once this has completed, the server echoes back to the client, which can then send the API request using the newly stored image. This process repeats every 500 ms. The Emotion API conveniently returns JSON data, which we parse using JavaScript and display on a chart.
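As a rough illustration of the two client-side steps described above, the sketch below splits a Base64 data URL into the parts that would be POSTed to the server, and picks the strongest emotion from a parsed response. The helper names `parseDataUrl` and `dominantEmotion` are hypothetical, and the response shape (an array of faces, each with a `scores` object) is an assumption made for illustration; this is not the code written at the hackathon.

```javascript
// Hypothetical helper: split a Base64 data URL (for example, one exported
// from a canvas holding a webcam frame) into its MIME type and payload.
function parseDataUrl(dataUrl) {
  const match = /^data:([^;,]+);base64,(.+)$/.exec(dataUrl);
  if (!match) {
    throw new Error("Not a Base64 data URL");
  }
  return { mimeType: match[1], base64: match[2] };
}

// Hypothetical helper: pick the strongest emotion for the first detected face.
// Assumes `faces` is the parsed JSON response: an array of faces, each with a
// `scores` object mapping emotion names to values between 0 and 1.
function dominantEmotion(faces) {
  if (!Array.isArray(faces) || faces.length === 0) {
    return null; // no face detected in this frame
  }
  let best = null;
  for (const [name, value] of Object.entries(faces[0].scores)) {
    if (best === null || value > best.value) {
      best = { name, value };
    }
  }
  return best;
}
```

A chart could then be updated with `dominantEmotion(response)` each time a new frame has been analysed, skipping frames where it returns `null`.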

After we finished that part, it was time to work on the Speech-to-Text API, and we did not anticipate how challenging this part would be. The Microsoft team came to see how we were getting on while we were mid-breakdown trying to set up Node.js. They gave us some ideas for implementing real-time Speech-to-Text without Node.js, and a few hours later we had it working. The transcript is then sent to the Text Analytics API one sentence at a time, so that we can get feedback on the speech sentiment and report it back to the user.
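One way to feed a growing transcript to a sentiment service sentence by sentence is to buffer the text and release only the sentences that are already complete. The sketch below is a minimal version of that idea; `takeCompleteSentences` is a hypothetical name, and it assumes the speech-to-text output contains sentence-ending punctuation, which may not hold for every recogniser.

```javascript
// Hypothetical helper: pull complete sentences out of a transcript buffer.
// Returns the finished sentences plus the unfinished remainder, which should
// be kept and prepended to the next chunk of recognised speech.
function takeCompleteSentences(buffer) {
  const sentences = [];
  const pattern = /[^.!?]+[.!?]+/g; // text followed by closing punctuation
  let consumed = 0;
  let match;
  while ((match = pattern.exec(buffer)) !== null) {
    sentences.push(match[0].trim());
    consumed = pattern.lastIndex;
  }
  return { sentences, remainder: buffer.slice(consumed).trimStart() };
}
```

Each entry in `sentences` could then be sent for sentiment analysis on its own, while `remainder` waits for the speaker to finish the sentence.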

Overall, getting the APIs set up properly took us about 15 hours. We then had some time to work on the design and implement extra features. We implemented a login and registration page using an HTML template and our MySQL server, with PHP handling parsing and sessions. We also implemented a "key points" feature, allowing the user to write a list of things they want to mention during their interview, which are automatically ticked off the list when mentioned. We then gathered some common interview questions and had the program loop through them in random order, reading them out to the user using the HTML5 SpeechSynthesisUtterance API. Unfortunately, we ran out of time to add an interview history section where the user could see their previous practice interviews, but we hope to implement this in the future.
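The "key points" feature can be approximated with a simple case-insensitive search of the transcript. The sketch below is one possible approach under that assumption; `checkKeyPoints` is a hypothetical name, and plain substring matching is a deliberately simple stand-in for whatever matching the application actually uses.

```javascript
// Hypothetical helper: mark each key point as mentioned if its text appears
// anywhere in the transcript, ignoring case and surrounding whitespace.
function checkKeyPoints(keyPoints, transcript) {
  const spoken = transcript.toLowerCase();
  return keyPoints.map((point) => {
    const needle = point.trim().toLowerCase();
    return {
      point,
      // An empty key point is never counted as mentioned.
      mentioned: needle.length > 0 && spoken.includes(needle),
    };
  });
}
```

Running this after each new sentence of the transcript would let the page tick off items as soon as they are spoken. Substring matching will miss paraphrases, so a fuller version might compare word stems or synonyms instead.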

It was quite challenging to get all three APIs up and running in real time, as well as parsing and displaying the results appropriately, but we just about managed it. Pranay worked on front-end design, layout and branding, whilst Roneel and Sam set up the backend. After a night of no sleep and copious amounts of coffee, we ended up with a working product: Interview Bot asks 12 questions, listens to your speech pattern, analyses your facial expressions in real time, and gives you a breakdown of your overall score when you're finished, along with a complete transcript.

Demo

Below is a short video showing Interview Bot in action:

After Completion

After a warm meal and a short nap, the closing ceremony commenced, with the organisers announcing a shortlist of 16 teams, from the 57 submissions, who would each give a short presentation and demonstration of their concept. Upon seeing Interview Bot at number 11, our team discussed how we would present and which aspects each of us would cover. With a limit of 4 minutes, Roneel gave a quick introduction, which was followed by a demonstration. Unfortunately, as the web application was not loaded over HTTPS, the demonstration did not work! In the moment we were perplexed as to why this was the case. It was not until after the demonstration that we realised that Chrome doesn't allow webcam usage over an HTTP connection. However, as we had demonstrated a working version earlier and had a video of the working product, the judges were extremely generous. Following this, Sam gave an insight into which technologies were used and how the three APIs run in parallel, and Pranay finished off with a general conclusion and possible expansion opportunities.
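A failure like this can be caught before a demo with a small pre-flight check. The sketch below is a hypothetical example, not part of Interview Bot: `webcamProblem` is an invented name, and the `env` argument stands in for the browser's `window.isSecureContext` and `navigator.mediaDevices` so that the check can also be run outside a browser.

```javascript
// Hypothetical pre-flight check: returns a human-readable problem, or null
// if webcam capture looks possible in the given environment.
function webcamProblem(env) {
  if (env.isSecureContext === false) {
    return "Page is not served from a secure context (use HTTPS or localhost).";
  }
  if (!env.mediaDevices || typeof env.mediaDevices.getUserMedia !== "function") {
    return "navigator.mediaDevices.getUserMedia is not available.";
  }
  return null;
}

// In a browser this could be called as:
// webcamProblem({
//   isSecureContext: window.isSecureContext,
//   mediaDevices: navigator.mediaDevices,
// });
```

Showing the returned message on the page makes the cause obvious straight away, instead of leaving a blank video element.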

Once all the teams had finished their presentations, it was clear that a few were extremely impressive, as they used the technologies well and stuck closely to the core of the challenge. We honestly didn’t think we had a chance of winning, and this wasn’t helped by the non-functional demonstration earlier on. However, as the Microsoft panel got on stage, they started describing our concept, which prompted a few bemused stares within our team, as we were still unsure whether they were talking about us. When they announced that Interview Bot had won their challenge, we were stunned and went to the stage to collect our prizes.

Reflection

24 hours of breakthroughs and breakdowns resulted in an extremely challenging yet fun event. Overall, OxfordHack taught us a significant amount about ourselves, identifying our own strengths and weaknesses, improving our teamwork and communication skills, and most importantly, learning how to use new technologies and languages. The Microsoft challenge allowed us to use professional APIs to develop an idea we thought would be useful to a range of people, which was extremely rewarding.

After speaking with various people throughout the hackathon and condensing their ideas, we believe the potential for expanding this concept is almost limitless. Not only can this be a tool for students who want real-time feedback on their performance, but companies could also use it to assess their candidates’ performance, with the written transcript allowing employers to dissect the interview in detail.

Next Steps

Our team will continue working on Interview Bot, polishing and adding new features, and will be submitting it to the Microsoft Imagine Cup in March.

Imagine Cup

Fancy heading to Seattle for the WW Finals in July 2018? You can register here, and the deadline for submitting a video (intro to team and demo of project) and a supporting document (who your team is and what your project is) is the 16th of March 2018.

Final Thanks

We would like to thank OxfordHack for hosting an amazing event, and Microsoft for allowing us to use their technology to develop our idea and for their amazing prizes!
