Saturday, October 29, 2016

Week 8: 10/18/2016 - 10/25/2016

This past week I was at the 2016 Grace Hopper Celebration of Women in Computing in Houston, TX. I had a great time networking with companies and graduate schools. I also saw some great talks.

At the career fair I visited many graduate school booths, because I am interested in pursuing a Ph.D. in CS. The recruiter from UC Berkeley whom I met at Tapia was there, and I was able to give her a few faculty names that she will contact on my behalf. My information will also be passed along to the MIT CS Department (their master's programs were represented instead). Purdue wants me to send them a few faculty names, and I had a good experience visiting the University of Illinois's booth.

I am also looking for a summer internship, so I spent some time visiting different companies. I got an interview with Pure Storage after doing their coding challenge. It went so well that they recently invited me to an onsite interview in Mountain View, CA. I also received emails from Twitter and Facebook to start their interview processes. Unrelated to GHC, but I have a Google technical interview scheduled for Wednesday of next week! I am looking forward to seeing how these opportunities pan out.

I went to some great talks about machine learning in production, getting students involved in open source projects, and conflict resolution. One of the Red Hat engineers invited me to the open source panel and the conflict resolution talk. The talks were, respectively, "7 Hidden Gems: Building Successful Machine-Learning Products," "Open Source Belongs in Your Class. Where Do You Start?," and "Constructive Conflict Resolution: or How to Make Lemonade out of Lemons." I really enjoyed the talk about getting students involved in open source projects, because it will be useful information for when I am a faculty member at a university.

I also visited the ACM-W booth and CRA-W booth!

Here are some pictures from the conference:





Wednesday, October 26, 2016

Week 8: 10/18/2016 - 10/25/2016

This week I chose the 3 examples from each subcategory of tasks. In picking them, I tried to get a variety that covered all parts of the criteria for each subcategory. After choosing the tasks, I used the author tags, along with the tags we generated as a group, to determine the oracle (positive) tags. I also came up with the distractor (negative) tags. After that, I finished the PowerPoint presentation of our tasks and reformatted the tasks into Google Form questions. We are implementing the tasks this way in order to have a multiple-answer setup, since Tobii Studio does not support this functionality. I will now begin implementing the final tasks in their entirety in Tobii Studio.

I was out of town this week, so I was not present for our weekly meeting. I reviewed the "Stack Exchange Tagger" paper on my own, as well as the write-up Alyssa generated. Like us, this study reviews Stack Exchange auto-tagging/prediction methods. The authors use question titles and text to come up with tags, using support vector classification. An important outcome that should be considered in our own study is that user information is important to tagging accuracy. I added this to our website and made a few other updates to reflect our latest research.

Tuesday, October 25, 2016

Week 8: 10/18/2016 - 10/25/2016

This week I presented 'Stack Exchange Tagger', a student research article that focused on predicting tags using a classifier, as part of the more general problem of developing accurate classifiers for large-scale text datasets. The students took 10,000 questions from StackOverflow and analyzed the text using linear support vector classification. The results showed that the linear SVC performed better than all other kernel functions, and the best accuracy obtained from the analysis was 54.75%. The students concluded that the accuracy could have been higher if user information had been considered.
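For context, here is a minimal sketch of the kind of pipeline the article describes: turning a question's title and body text into features and training a linear SVC to predict a tag. The file name, column names, TF-IDF features, and one-tag-per-question simplification are my own assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch of linear-SVC tag prediction on StackOverflow-style questions.
# "questions.csv" and its columns ("Title", "Body", "Tags") are hypothetical.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

data = pd.read_csv("questions.csv")
text = data["Title"] + " " + data["Body"]        # combine title and body text
labels = data["Tags"].str.split().str[0]         # keep one tag per question for simplicity

X_train, X_test, y_train, y_test = train_test_split(
    text, labels, test_size=0.2, random_state=42)

# Convert text to TF-IDF features and train a linear support vector classifier.
vectorizer = TfidfVectorizer(stop_words="english", max_features=50000)
clf = LinearSVC()
clf.fit(vectorizer.fit_transform(X_train), y_train)

# Evaluate tag-prediction accuracy on held-out questions.
predictions = clf.predict(vectorizer.transform(X_test))
print("accuracy:", accuracy_score(y_test, predictions))
```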

Next week, I will be developing the abstract of our project, and we will be closer to submitting to OCWiC 2017.


Tuesday, October 18, 2016

Week 7: 10/11/2016 - 10/18/2016

This week I read 'Predicting Tags for StackOverflow Posts' and presented the material at our meeting. The results showed that the tag prediction classifier was 65% accurate at predicting an average of one tag per post. The study was designed to help improve the user experience on StackOverflow by gathering a collection of tags for users looking for specific solutions to programming problems, and by the possible implementation of a tag clean-up system. Next week, I will continue to explore possible methods for building our prediction model after we collect our gaze data.

Week 7: 10/11/2016 - 10/18/2016

This past week I prepared for and participated in a press conference at YSU for Dr. Sharif's NSF CAREER award. She and I spoke about the eye-tracking research we are doing, which includes the work we do with CREU. I spoke about how big of an impact this research has had on my future academic career.  It was an honor to participate in the celebration of Dr. Sharif's achievements, as she has been an excellent mentor and friend.
Here is a publicity video from YSU showing portions of the press conference:

I also took the GRE on Saturday, October 15, 2016, so I spent most of my week preparing for that. My unofficial scores were respectable, but I would like them to be a little higher. I am planning on taking the GRE again on Tuesday, November 15, 2016. I am excited to be one step closer to finishing my graduate school applications!

I am currently attending the 2016 Grace Hopper Celebration of Women in Computing in Houston, Texas. I am looking forward to a great time and will have pictures to come!

Monday, October 17, 2016

Week 7: 10/11/2016 - 10/18/2016

Continuing with task and experiment creation this week, we are finishing up our tag assignment of tasks. I have re-categorized and added to the task list according to the criteria I specified. The criteria are as follows:
  • Simple tasks will include content found in CS1 classes: common knowledge of topics such as simple data types, operators, control structures, and basic properties of the C++ language.
  • Average tasks will include knowledge that is common to someone beyond the CS1 level and gained through experience with programming: specific details of data structures and more involved applications of aspects from the simple level.
  • Complex tasks will include applications of more difficult or compound topics: algorithm design, bit manipulation, using pointers, and obscure or intense properties of the C++ language.
By defining criteria based on specific topics and applications, we are able to more clearly separate and categorize our tasks. Moving forward with this, I have begun experiment simulation within Tobii Studio. I have utilized some of its tools, such as the assessment tool, for users to interact with the study (i.e., to record confidence levels on tag assignment).

One thing we need to work on is task and tag presentation. We need to determine whether to let subjects view tag suggestions on the same screen and at the same time as the task, or to have them move on to the next screen to see the tags after reviewing the question. I anticipate there may be some desire for a subject to navigate back and forth if they cannot see the tags as they evaluate. Additionally, we need to decide how tags will be recorded. We are allowing for up to 5 tag suggestions as well as an open-ended, user-generated tag suggestion. Since Tobii Studio does not allow for multiple-answer or open-ended responses, we need to evaluate whether we would like subjects to write down their tag selection, speak it out loud, or use some other alternative. I think this will require research into and reflection on previous studies of how the method of recording answers affects attention and gaze data.

Wednesday, October 12, 2016

Week 6: 10/4/2016 - 10/11/2016

This week I continued working on the experiment design, summary, and prototype. As we assign tags to our selected tasks, I am in the process of defining the criteria for classifying tasks. When we deliver tasks to our subjects, we want to define them as being simple, average, or complex, so concrete reasoning for this categorization is valuable to our study. While I mentioned last week that the average and complex categories would be combined, we decided to separate them again by establishing concrete criteria for each category. I do believe, however, that certain questions may not be as hard for some users, depending on their background. This may be a good point to look into after we have collected our data and begin to analyze it, perhaps in combination with recorded user confidence. This week we also analyzed a paper, presented by Alyssa, titled 'Towards Predicting the Best Answers in Community-Based Question-Answering Services'. This study used questions pulled from Stack Overflow, so although its objective differed, it contained elements related to our own project. The results included predicting which answers would be best based on their posting date relative to the posting date of the initial question, as well as how the amount of detail and the length of description in an answer related to accuracy.

Tuesday, October 11, 2016

Week 6: 10/4/2016 - 10/11/2016

This week, I read the article 'Towards Predicting the Best Answers in Community-Based Question-Answering Services' and presented the material to our group. The study analyzed a large dataset from StackOverflow, based on answer content for a random sample of questions asked before August 2012. It aimed to predict whether an answer would be selected as the best, using a classifier trained on labeled data. The results showed that the more answers that precede a given answer on StackOverflow, the less likely that answer is to be selected as the best, and that what qualifies an answer as the best is more detail and a clear explanation of the solution. Typically, answers that provide an in-depth solution ranked the best, and on answers that were not the first posted to a question, the classifier was 70.71% accurate.
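As a rough illustration of this kind of "best answer" prediction, here is a small sketch of a classifier trained on labeled answers. The feature set (posting delay, answer length, code blocks, comments), the toy numbers, and the logistic-regression choice are my own assumptions for illustration; the paper's actual features and model may differ.

```python
# Hypothetical best-answer classifier on hand-made answer features.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Each row is one answer: [hours after the question was posted, length in words,
# number of code blocks, number of comments]; label 1 = accepted as best answer.
X = np.array([
    [0.5, 220, 2, 4],
    [6.0,  40, 0, 0],
    [1.2, 310, 3, 6],
    [24.0, 15, 0, 1],
    [2.0, 180, 1, 3],
    [12.0, 60, 0, 0],
])
y = np.array([1, 0, 1, 0, 1, 0])

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=0)
model = LogisticRegression().fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```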

I also read through our task list that Ali created and came up with tags that I thought worked best with the questions being asked on StackOverflow. Once our tasks are finalized, we will be using our empirical studies lab to gather participants and ask them to predict tags of their own, or choose from our list of five. 

Week 6: 10/4/2016 - 10/11/2016

This past week I came up with possible tags for each of the StackOverflow questions Ali created for the study we will be conducting, read the paper 'Towards Predicting the Best Answers in Community-Based Question-Answering Services', finished the second revision of the paper we submitted to YSU's Honors College Journal, and spoke at the Hyland and OCWiC Women in Tech Event about my research and my experiences at OCWiC in 2015.

On Saturday (10/8/2016) I spent the afternoon at Hyland in Westlake, Ohio talking about the eye-tracking research I do with the CREU program and my experience presenting that work at the 2015 Ohio Celebration of Women in Computing. I also highlighted the CREU program as something the women there should consider doing. A picture from the event can be seen to the right.

I also read the paper 'Towards Predicting the Best Answers in Community-Based Question-Answering Services'. It addressed the problem of predicting whether an answer will be selected as the "best" answer, based on learning from labeled data. The contributions of this paper that will be interesting for our CREU work are the designed features that measure key aspects of an answer and the features that heavily influence whether an answer is the "best" answer. The authors concluded that the best answer is usually the one with more details and comments and the one that is most different from the others. We would like to predict the best answer in StackOverflow documents using eye-tracking features, so this is a beneficial paper to read toward that goal.

Finally, I finished the second revision of the paper we submitted to the YSU Honors College Journal about the work we performed in the CREU 2015-2016 program. I also created tags I thought were appropriate for the StackOverflow questions we will be using for our study.

Tuesday, October 4, 2016

Week 5: 9/27/2016 - 10/4/2016

After re-reading all of the articles we have analyzed since the start of our project, I created a summary table of features for each study in Microsoft Word. It displays the author names, date, dataset, a list of methods used, a list of features, and the results of each finding. I then analyzed the sample training data that I had started sorting through last week and made a list of tags that I find best suit the questions asked by users on StackOverflow. By analyzing the Kaggle dataset, I feel more confident examining the questions our group has come up with and predicting what tags would best fit each question. Once we reconvene and compare our tags, we will be that much closer to setting up our study and gathering participants.

I also read through the article "Reading Without Words: Eye Movements in the Comprehension of Comic Strips" and listened to my group member Jenna's presentation of its analysis. In the study, participants viewed comic strips one panel at a time, in both the original and a randomized order, and then viewed sets of 6 panels, again both ordered and randomized. The study focused on each participant's gaze fixations and investigated how disrupting a person's reading of a comic strip would affect their attention. The conclusion was that skewing a person's way of viewing a comic strip slows down their comprehension of the narrative. I think analyzing these studies will help us when we are finalizing our tag prediction experiment.

Week 5: 9/27/2016 - 10/4/2016

This past week I reread and edited my summary of the paper, Reading Without Words: Eye Movements in the Comprehension of Comic Strips. The goal of the paper was to investigate how disrupting the visual sequence of a comic strip would affect attention. The authors of the paper used eye-tracking data to determine visual attention. They found that disrupting the visual sequence of a comic strip slows down the viewer and makes comprehending the narrative more difficult.

The highlights of the paper that are worth pursuing for our study are the related work by Cohn 2013b, 2013c, and McCloud 1993 that analyzes the comprehension of words in sentences, and the correlation maps used to compare fixation locations between experiments. The correlation maps are interesting because they are used as a technique to determine how similar the visual attention of two different participants is on two different panels. For our study, it might be useful to determine how similar the visual attention of two different participants is on two different StackOverflow questions. Analyzing the comprehension of words/code in a StackOverflow question would also be useful to our study.
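To make the correlation-map idea concrete, here is a minimal sketch of comparing two participants' fixation patterns on the same stimulus: bin each participant's fixations into a coarse grid, smooth it, and correlate the two maps. The screen size, grid resolution, smoothing, and example coordinates are my own assumptions, not details from the paper.

```python
# Sketch: compare two participants' fixation density maps with a correlation.
import numpy as np
from scipy.ndimage import gaussian_filter
from scipy.stats import pearsonr

def fixation_map(fixations, width=1280, height=1024, bins=32, sigma=1.0):
    """Bin (x, y) fixations into a coarse grid and smooth into a density map."""
    xs, ys = fixations[:, 0], fixations[:, 1]
    heatmap, _, _ = np.histogram2d(ys, xs, bins=bins, range=[[0, height], [0, width]])
    return gaussian_filter(heatmap, sigma=sigma)

# Example fixations (x, y in pixels) for two participants viewing the same stimulus.
p1 = np.array([[200, 300], [220, 310], [600, 400], [640, 420], [300, 700]])
p2 = np.array([[210, 290], [230, 330], [610, 390], [900, 200], [320, 680]])

map1 = fixation_map(p1).ravel()
map2 = fixation_map(p2).ravel()
r, p_value = pearsonr(map1, map2)
print(f"fixation-map correlation: r = {r:.2f}")
```

The same comparison could, in principle, be run on two participants' gaze over a StackOverflow question instead of a comic panel.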

Week 5: 9/27/2016 - 10/4/2016

This week I finished up the list of tasks. For simplicity, we decided to use posts that refer to C++ code only (initially we had discussed both Java and C++). Categorizing by difficulty, I found 10 simple tasks and 10 average-complex tasks that would be possibilities for our study. I decided to merge the average and complex categories since the difficulty of a task will vary between subjects depending on personal experience. In our experiment design, subjects will need to determine the appropriate tags for a posting, so our next step is to come up with tags for each task in order to build the lists of suggestions they may choose from.

Additionally, this week we reviewed the paper "Reading Without Words: Eye Movements in the Comprehension of Comic Strips". This was an eye-gaze study using comic strips (images only, no text) as visual stimuli. The researchers shuffled the order of the comic strip panels and studied how this affects attention; the shuffled strips were harder to understand and required more attention. Correlation maps were then used to compare fixation locations in the comic strips. The researchers also analyzed their data using t-tests to study variance between experiments (comparing the gaze data on the normal sequence of comics versus the randomized sequence).
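For reference, a t-test comparison of that kind might look like the sketch below, where a per-participant gaze measure (here, mean fixation duration) is compared between the ordered and randomized conditions. The numbers are made up, and the independent-samples test is only one option; a paired test (scipy.stats.ttest_rel) would apply if the same participants saw both conditions.

```python
# Sketch: compare a gaze measure between two viewing conditions with a t-test.
from scipy.stats import ttest_ind

ordered_fix_dur = [212, 198, 225, 240, 205, 219, 230, 210]      # normal panel sequence (ms)
randomized_fix_dur = [251, 262, 244, 270, 255, 248, 266, 259]   # shuffled panel sequence (ms)

t_stat, p_value = ttest_ind(ordered_fix_dur, randomized_fix_dur)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```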