This week, I analyzed and presented a dataset and paper from a contest sponsored by Kaggle. I read through Galina Lezina and Artem Kuznetsov's published work, 'Predict Closed Questions on StackOverflow.' Their task involved examining public and private datasets of user, post, and tag features from StackOverflow and building a classifier that predicts whether a question will be closed, along with the reason given when the community votes to close it. They hoped their work would ease the task of moderating these posts on the StackExchange network through automation.
The results indicated that for the method they used (a machine learning tool called Vowpal Wabbit), user interaction features actually worsened the outcome by a small margin, while text features contributed the most to classification performance. They also noted that some questions that are still marked open should, in reality, have been closed.
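To make the setup concrete, here is a minimal sketch of the kind of text-based close-reason classifier the paper describes. This is not the authors' pipeline: they used Vowpal Wabbit, whereas this illustration swaps in scikit-learn, and the example questions and labels are invented purely for demonstration.

```python
# Minimal sketch (not the authors' pipeline): train a linear classifier on
# question text to predict a close reason. The paper used Vowpal Wabbit;
# scikit-learn stands in here, and the data below is invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training examples: question text and its close status.
questions = [
    "How do I reverse a string in Python?",
    "What is the best programming language?",
    "Why does my code not work? Please fix it for me.",
    "How can I parse JSON in Java?",
]
labels = ["open", "not constructive", "too localized", "open"]

# Bag-of-words text features feeding a logistic regression classifier.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(questions, labels)

# Predict the close status of a new, unseen question.
print(model.predict(["Which framework should I learn?"]))
```

Even this toy version hints at the paper's finding: the text of the question alone carries most of the signal about whether it will be closed.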
I think that even though the baseline models for the data provided by Kaggle excluded the actual content of each post, anyone who examines a question on StackOverflow relies on that content for context when they can't immediately understand what is being asked. As the results of the study showed, text features were informative, and this holds true in practice. How can one fully answer a question on StackOverflow without some reference to what is actually being asked?
As a StackOverflow user, I hope that we will soon be able to predict these outcomes with these kinds of classifiers.
On Sunday, I attended the Silly Science Sunday event sponsored by OH WOW! in Youngstown. I helped Dr. Sharif and Ali gather participants to demo our eye-tracking game and a simple code-tracing game using small robots. We had a great turnout at the event, and as always, it's great to inspire young minds with the lighter side of our technology!