Thursday, March 30, 2017

Week 26: 3/22/2017 - 3/29/2017

Work Accomplished

This past week I finalized my poster for QUEST, titled "Towards Mining Eye-tracking Datasets for Expertise Prediction," and sent it to the printer. I also finished the R script for step 3 of the data pre-processing, which merges rows in each task's data frame based on the fixation index. The hardest part of writing this R script was selecting the correct merging pattern for each data field across each dataset. I also completed a fellowship application to our Phi Kappa Phi chapter at YSU.
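
A minimal sketch of this kind of merge, written in Python rather than R. The column names (fixation_index, timestamp_ms, duration_ms, aoi) and the rule chosen for each column are assumptions for illustration; the actual Tobii Studio fields and merging patterns are not listed in this post.

```python
from itertools import groupby

# Hypothetical merge rule per column: how the values belonging to one
# fixation are collapsed into a single value.
MERGE_RULES = {
    "fixation_index": lambda values: values[0],  # identical within a group
    "timestamp_ms": min,                         # start of the fixation
    "duration_ms": sum,                          # total time in the fixation
    "aoi": lambda values: values[0],             # assume one AOI per fixation
}


def merge_by_fixation(rows):
    """Collapse consecutive rows that share a fixation index into one row."""
    merged = []
    for _, group in groupby(rows, key=lambda row: row["fixation_index"]):
        group = list(group)
        merged.append({
            column: rule([row[column] for row in group])
            for column, rule in MERGE_RULES.items()
        })
    return merged


# Made-up sample rows, only to show the shape of the input.
rows = [
    {"fixation_index": 1, "timestamp_ms": 0, "duration_ms": 8, "aoi": "title"},
    {"fixation_index": 1, "timestamp_ms": 8, "duration_ms": 8, "aoi": "title"},
    {"fixation_index": 2, "timestamp_ms": 16, "duration_ms": 8, "aoi": "code"},
]
merged_rows = merge_by_fixation(rows)
```

Keeping the rules in one table makes the per-field decision explicit, which is where most of the difficulty lies: a different field may need "first", "sum", or "min" depending on what it records.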

Goal

Weekly goal(s) - In the next week, I will be double-checking the keywords that Alyssa selects for the non-eye-tracking keyword selection. I will also be gathering statistics on the new datasets I created, such as the number of views of each AOI per task per participant and across participants. This should help us narrow down keywords.
Long-term goal(s) - Predict keywords by modifying a method proposed for keyword prediction in a Kaggle competition so that it incorporates eye-tracking. We will also predict keywords without eye-tracking and compare the two keyword sets generated. These keywords inform our tags, so determining them will tell us which pieces of code and/or text in a StackOverflow document are pertinent to tag selection.
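
The AOI view statistics mentioned in the weekly goal could be gathered roughly as follows. This is a Python sketch under stated assumptions: each merged fixation row counts as one view, and the field names (task, participant, aoi) are hypothetical.

```python
from collections import Counter


def count_aoi_views(fixations):
    """Count AOI views per task per participant, and per task across participants."""
    per_participant = Counter()
    across_participants = Counter()
    for fixation in fixations:
        task = fixation["task"]
        aoi = fixation["aoi"]
        per_participant[(task, fixation["participant"], aoi)] += 1
        across_participants[(task, aoi)] += 1
    return per_participant, across_participants


# Made-up sample data, only to show the shape of the input.
fixations = [
    {"task": "T1", "participant": "P01", "aoi": "title"},
    {"task": "T1", "participant": "P01", "aoi": "title"},
    {"task": "T1", "participant": "P02", "aoi": "title"},
    {"task": "T1", "participant": "P02", "aoi": "code"},
]
per_participant, across_participants = count_aoi_views(fixations)
```

AOIs with high counts across participants would be natural candidates when narrowing down keywords.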

Outcome(s)

  1. QUEST poster sent to printing and final draft completed
  2. R script to complete step 3 of the data pre-processing steps written (see above explanation of step 3)
  3. Phi Kappa Phi Fellowship application completed

Wednesday, March 29, 2017

Week 26: 3/22/2017 - 3/29/2017

Work Accomplished: 
This week, I started creating keyword AOIs using the data we've collected for our study. I manually examined similar keywords for all 16 participants and determined the keywords for each of the 9 tasks. I also uploaded my pre/post survey data to Dropbox under Survey Results. The next step is to take the Train and Test data from a Facebook Kaggle competition and run it through Apache Spark to test the keyword analysis. Then, I will run our own data through it to see if we get similar results. After I export the data and create an Excel file with images corresponding to our AOIs, Jenna will run that data through her R script to separate each value. From there, we will analyze the data further and come up with our tag prediction system.


Weekly Goal: Get up to 20 participants for the study. I was able to get 1 more person scheduled, but they were unable to do the study last week.


Future Goal: Take the exported data and compare it against the train and test data from predicting keywords on Kaggle.



Thursday, March 23, 2017

Week 25: 3/15/2017 - 3/22/2017

Accomplishments: 
This week, I wrote the abstract for my QUEST poster, "Improving Stack Overflow Tag Prediction Using Eye Tracking," and Jenna submitted it for me. I also gathered the new participant data and uploaded the pre/post responses to our Dropbox. Next week, I will be in charge of creating keyword AOIs in Tobii Studio and, hopefully, running the last of our remaining participants.

Weekly Goal:  Gather more participants for our study.

Future Goal: Run the data through a machine learning algorithm and present my poster at QUEST in April.




Wednesday, March 22, 2017

Week 25: 3/14/2017 - 3/22/2017

Work Accomplished

This past week I created a draft of my poster for QUEST, titled "Towards Mining Eye-tracking Datasets for Expertise Prediction." I also finished the R script for steps 1 & 2 of the data pre-processing steps listed below. The hardest part of writing this R script was splitting the data into per-task files containing all participants' data, because the data is currently stored per participant, with each participant file holding the eye-tracking data for all tasks.
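
The per-participant to per-task regrouping can be sketched as below. This is a Python illustration rather than the R script itself, and it assumes each row carries a hypothetical task field; the real column names come from the Tobii Studio export and are not given here.

```python
from collections import defaultdict


def regroup_by_task(participant_rows):
    """Turn {participant: [rows for all tasks]} into {task: [rows for all participants]}."""
    by_task = defaultdict(list)
    for participant, rows in participant_rows.items():
        for row in rows:
            # Tag each row with its participant so the origin is not lost.
            by_task[row["task"]].append({"participant": participant, **row})
    return dict(by_task)


# Made-up sample data: one list of rows per participant, covering all tasks.
participant_rows = {
    "P01": [{"task": "T1", "aoi": "title"}, {"task": "T2", "aoi": "code"}],
    "P02": [{"task": "T1", "aoi": "code"}],
}
by_task = regroup_by_task(participant_rows)
```

Each value in by_task could then be written out as one file per task.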

Goal

Weekly goal(s) - In the next week, I will be working on step 3 of the data pre-processing steps below, which will be the most difficult of all of the steps, because aggregations are tricky. I will also be presenting the draft of my poster for QUEST as practice for the actual presentation.
Long-term goal(s) - In the next few weeks, I will be working to accomplish the pre-processing steps listed below (two of which are already complete):
  1. Create one Excel file with the TobiiStudio Field name shown on the left for each participant with all tasks
  2. Create a file for each task with all participant data with relevant experiment field name data columns
  3. Create a file for each task with fixation and duration data merged. All other columns remain as in step 2.

Outcome(s)

  1. QUEST poster draft completed
  2. R script to complete steps 1 & 2 of the data pre-processing steps written

Saturday, March 18, 2017

Week 24: 3/8/2017 - 3/14/2017

Work Accomplished

This past week I wrote and submitted my abstract for a poster at QUEST at YSU. My poster will be about mining eye-tracking data using sequential analysis techniques, and it presents the paper we submitted to MSR a few weeks ago. I also touched base with Dr. Sharif (outside of our regular meeting time) to discuss next steps for our StackOverflow data analysis, and I got a clearer picture of the data pre-processing steps we need to complete in the coming weeks:
  1. Create one Excel file with the TobiiStudio Field name shown on the left for each participant with all tasks
  2. Create a file for each task with all participant data with relevant experiment field name data columns
  3. Create a file for each task with fixation and duration data merged. All other columns remain as in step 2.
I will be working with Alyssa in the next few weeks to accomplish these steps.

I also found out yesterday that I will be receiving the 2017 NSF Graduate Research Fellowship. So, that was exciting news!!

https://www.fastlane.nsf.gov/grfp/AwardeeList.do?method=loadAwardeeList

Goal

Weekly goal(s) - In the next week, I will be working on step 1 of the pre-processing steps listed above and on my poster for QUEST.
Long-term goal(s) - In the next few weeks, I will be working with Alyssa to accomplish the pre-processing steps listed above.

Outcome(s)

  1. Clear next steps for data pre-processing for the StackOverflow project
  2. QUEST poster abstract submission

Wednesday, March 15, 2017

Week 24: 3/8/2017 - 3/16/2017

Work Accomplished: 

This week, I worked on an R script with Jenna's help. The script, which is still in progress, will process our Excel files once I export the data from Tobii Studio. I've collected pre- and post-questionnaire data from the users who participated in our Stack Overflow eye-tracking study. Those files are stored on my computer, and I still have to upload them to our Dropbox. I also wrote an abstract for YSU's QUEST in April. I attended a digital media conference in New York City over spring break, and it gave me perspective from software developers working in newsrooms. Digital media will continue to grow, and it was interesting to see the software that some of the outlets use to track data. Storyful, a news outlet that gathers data from social platforms around the world, uses a heatmap to store and track news articles and videos from around the world. It's refreshing to know that even traditional media outlets are starting to realize the importance of online content and data technology.

Work Accomplished: Collected more data, submitted QUEST abstract for review. 

Future Work: Present poster at QUEST, finish R script to export user data and submit pre/post questionnaire data to Dropbox.

Thursday, March 9, 2017

Weeks 22 & 23: 2/22/2017 - 3/8/2017

Work Accomplished

I decided to combine this week's and last week's blog posts, because I spent last week at OCWiC 2017 and on a graduate school visit, and this week is our spring break. So, I was able to make progress on the data analysis for the StackOverflow project this week, while last week I was busy networking. OCWiC 2017 was a lot of fun and a great networking opportunity. I met many female students and faculty from throughout Ohio who could be potential collaborators moving forward. I also attended a very informative talk about empowering future female programmers, which will inform our ACM-W chapter's outreach events. Here is a photo of me presenting my poster:


I also spent four days at Carnegie Mellon University on a graduate school visit. There I met many faculty and students in the Institute for Software Research. I really felt like it was the right fit, so I officially committed to CMU this week.

Finally, I wrote an R script to take the StackOverflow study data and sort it by column name for input into our data analysis.
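
A rough Python sketch of such a script, assuming that "sort by column name" means reordering the columns alphabetically by header and that the data is available as CSV text. Both are assumptions, and the sample columns are made up.

```python
import csv
import io


def sort_columns(csv_text):
    """Return the same CSV data with its columns reordered alphabetically by name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = sorted(reader.fieldnames)
    output = io.StringIO()
    writer = csv.DictWriter(output, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return output.getvalue()


# Made-up sample input with hypothetical column names.
sample = "task,aoi,duration_ms\nT1,title,120\n"
sorted_csv = sort_columns(sample)
```

A fixed column order makes it easier to line up files from different participants before combining them.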

Goal
Weekly goal(s) - In the next week, I will be taking some time off from research on the StackOverflow project for the rest of spring break and until I can touch base with Dr. Lazar and Dr. Sharif on the next steps.
Long-term goal(s) - Perform select data analyses appropriate for answering the following research questions:
  1. To what degree do programmers focus on key words that extraction techniques generate?
  2. To what degree do the top n keywords from our approach and the standard approach match our Oracle generated keywords?
  3. What are the best machine learning algorithms (informed by eye gaze) that can be successfully used to make predictions?
Outcome(s)
  1. Presented my poster at OCWiC 2017 and attended various talks and workshops there
  2. Visited Carnegie Mellon University and made my official decision to attend their PhD program in Software Engineering in the fall
  3. Wrote an R script to sort our StackOverflow data by column name and output to a new file

Week 23: 3/01/2017 - 3/08/2017

Accomplishments: This week, I spent most of my time gathering more data for the eye tracking study. I managed to squeeze in 2 more participants before our spring break. Jenna and I are collaborating on an R script that allows participant data to be searched by column name when given a file. She is writing the script, and I'm going to review it after spring break. I attended a panel for the ACM-W about gender inclusion in entrepreneurship. I don't see myself opening my own tech company, but start-ups are populating the area, and it's good to get insight into the challenges they may face or are already facing.


Weekly goal: Gather more participant data to reach at least 20 people

Future goal: Use the R script to loop through participant files and give an analysis of each combined file so that we can use this to inform our machine learning algorithm.


Work accomplished: Received more participant data to analyze, in the process of writing an R script. 

Wednesday, March 1, 2017

Week 22: 2/22/2017 - 3/01/2017

Accomplishments: 
This week, I only had one participant in my eye tracking study. On Friday, I attended the Ohio Celebration of Women in Computing Conference in Sandusky, Ohio with our 'YSU Team'. There, I presented my poster 'Improving Stack Overflow Tag Prediction Using Eye Tracking' to a panel of judges, along with female students from various schools across the United States. It was an opportunity to not only meet potential employers in the industry but develop connections with women in technology who are also conducting similar studies in eye tracking and multi-label classification. 




I attended several of the sessions at the Sawmill Creek Lodge, one of which I found interesting and relevant to my eye tracking study. Cindy Marling, an associate professor of computer science at Ohio University, gave a talk about her research on software that monitors blood glucose levels for people with diabetes. Marling used a machine learning model for the blood glucose levels, and when asked about her process for choosing the best algorithm, she mentioned Weka. I had never heard of it, and after some light digging, I found Weka to be a useful tool. It is open-source software that offers a collection of machine learning algorithms for data mining. I will be following some of the tutorials once I finish data collection. OCWiC was a great experience, and I would encourage all STEM women to get involved early in their careers.


Weekly Goal: Collect more participants to reach goal of 20 in the next few weeks. 


Future Goal: Export participant data into 7 files and write an R script that will take a file and loop through each one to separate each individual column name of the AOI.