<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Stackoverflow implicit feedback recommendation system in Intel® Distribution for Python*</title>
    <link>https://community.intel.com/t5/Intel-Distribution-for-Python/Stackoverflow-implicit-feedback-recommendation-system/m-p/544802#M14</link>
    <description>Hi,&lt;P&gt;&amp;nbsp;&lt;/P&gt; We are closing this discussion since we do not handle these types of questions in our community.&lt;P&gt;&amp;nbsp;&lt;/P&gt; If you have a question about Intel-specific AI frameworks/tools, we would be happy to address your queries.&lt;P&gt;&amp;nbsp;&lt;/P&gt; Thanks &amp;amp; Regards,&lt;P&gt;&amp;nbsp;&lt;/P&gt;Sandhiya</description>
    <pubDate>Fri, 25 May 2018 05:16:35 GMT</pubDate>
    <dc:creator>idata</dc:creator>
    <dc:date>2018-05-25T05:16:35Z</dc:date>
    <item>
      <title>Stackoverflow implicit feedback recommendation system</title>
      <link>https://community.intel.com/t5/Intel-Distribution-for-Python/Stackoverflow-implicit-feedback-recommendation-system/m-p/544800#M12</link>
      <description>&lt;P&gt;I am trying to build a &lt;B&gt;user-item recommendation engine based on Stackoverflow favourite votes on questions&lt;/B&gt;.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;B&gt;The objective:&lt;/B&gt;&lt;P&gt;To build a &lt;B&gt;webpage / IDE plugin&lt;/B&gt; where the user receives his &lt;B&gt;top N recommended questions&lt;/B&gt; based on:&lt;/P&gt;&lt;P&gt;     - his previous favourite votes on Stackoverflow&lt;/P&gt;&lt;P&gt;     - the programming language he is currently using (this will be a filter on the question tag, e.g. only #java questions)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;B&gt;The input data:&lt;/B&gt;&lt;P&gt;I am using the Stackexchange data dump, which can be found here: &lt;A href="https://archive.org/download/stackexchange"&gt;https://archive.org/download/stackexchange&lt;/A&gt;. From there I've extracted the data that I thought would be useful:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;     Votes table (each User - Question pair represents a favourite vote by the user for the question):&lt;/P&gt;&lt;P&gt;     &lt;B&gt;UserId - QuestionId&lt;/B&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;     Tags table:&lt;/P&gt;&lt;P&gt;     &lt;B&gt;QuestionId - TagId&lt;/B&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I also have a lot of details about each user/question, which would make sense to use in a content-based approach.
The only content I have used so far is the question tags.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;B&gt;Problems/Properties of the data:&lt;/B&gt;&lt;P&gt;- the data consists of &lt;B&gt;implicit feedback&lt;/B&gt;: a user either marked a question as a favourite or didn't (binary 0/1 problem)&lt;/P&gt;&lt;P&gt;- the data set is &lt;B&gt;quite large&lt;/B&gt;, so training and evaluating a model takes a lot of time (the votes CSV file is a few GB)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;B&gt;Progress so far:&lt;/B&gt;&lt;P&gt;So far I've tried a few different approaches, most of which are some sort of &lt;B&gt;collaborative filtering&lt;/B&gt;:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;- the first thing I tried was using &lt;B&gt;cosine similarity to get top N question-to-question recommendations&lt;/B&gt;, just to test whether the results are better than random&lt;/P&gt;&lt;P&gt;- then I tried &lt;B&gt;Spark's Alternating Least Squares Matrix Factorisation model&lt;/B&gt;, but the results were also mediocre, because I am using implicit feedback data and the ALS technique is built for explicit data&lt;/P&gt;&lt;P&gt;- I've also tried another &lt;B&gt;MF model with a Bayesian Personalised Ranking loss function&lt;/B&gt;, which is better suited for implicit data.
The library I used here is &lt;B&gt;LightFM&lt;/B&gt; and the evaluation metric is &lt;B&gt;ROC AUC&lt;/B&gt;: &lt;A href="https://www.kaggle.com/iancuv/lightfm-demo?scriptVersionId=3670161"&gt;https://www.kaggle.com/iancuv/lightfm-demo?scriptVersionId=3670161&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;B&gt;Open questions / suggestions:&lt;/B&gt;&lt;P&gt;Do you have any suggestions for other approaches I should try?&lt;/P&gt;&lt;P&gt;How would you approach this problem?&lt;/P&gt;&lt;P&gt;What preprocessing of the data makes sense to achieve better results?&lt;/P&gt;&lt;P&gt;Are any of the mentioned techniques a good choice for this problem?&lt;/P&gt;&lt;P&gt;Would a purely content-based approach make sense?&lt;/P&gt;&lt;P&gt;If yes, how can I improve the results?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I should also mention (you probably figured it out) that I'm a CS student, new to the AI/machine learning field. The only applications I've built in the past involve either simple regression or classification, nothing as complicated as implicit feedback recommendation systems. I know the problem and questions above are very specific, but any help is very much appreciated.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;B&gt;Useful links:&lt;/B&gt;&lt;P&gt;&lt;B&gt;&lt;A href="http://lyst.github.io/lightfm/docs/home.html"&gt;http://lyst.github.io/lightfm/docs/home.html&lt;/A&gt; Welcome to LightFM's documentation! - LightFM 1.14 documentation&lt;/B&gt;&lt;/P&gt;&lt;P&gt;&lt;B&gt;&lt;A href="https://spark.apache.org/docs/latest/api/python/"&gt;https://spark.apache.org/docs/latest/api/python/&lt;/A&gt; Welcome to Spark Python API Docs!
- PySpark master documentation&lt;/B&gt;&lt;/P&gt;&lt;P&gt;&lt;B&gt;&lt;A href="https://datasciencemadesimpler.wordpress.com/tag/alternating-least-squares/"&gt;https://datasciencemadesimpler.wordpress.com/tag/alternating-least-squares/&lt;/A&gt; Alternating Least Squares – Data Science Made Simpler&lt;/B&gt;&lt;/P&gt;&lt;P&gt;&lt;B&gt;&lt;A href="https://arxiv.org/pdf/1205.2618.pdf"&gt;https://arxiv.org/pdf/1205.2618.pdf&lt;/A&gt; - Bayesian Personalised Ranking MF&lt;/B&gt;&lt;/P&gt;&lt;P&gt;&lt;B&gt;&lt;A href="http://stanford.edu/~rezab/classes/cme323/S15/notes/lec14.pdf"&gt;http://stanford.edu/~rezab/classes/cme323/S15/notes/lec14.pdf&lt;/A&gt;&lt;/B&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 23 May 2018 11:55:11 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Distribution-for-Python/Stackoverflow-implicit-feedback-recommendation-system/m-p/544800#M12</guid>
      <dc:creator>IVerg1</dc:creator>
      <dc:date>2018-05-23T11:55:11Z</dc:date>
    </item>
    <item>
      <title>Re: Stackoverflow implicit feedback recommendation system</title>
      <link>https://community.intel.com/t5/Intel-Distribution-for-Python/Stackoverflow-implicit-feedback-recommendation-system/m-p/544801#M13</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;Please note that this forum is primarily intended to address problems and information related to Intel AI frameworks, tools, and other offerings such as Intel® AI DevCloud.&lt;/P&gt;&lt;P&gt;However, since you have reached out to us, we would recommend the following links for reference:&lt;/P&gt;&lt;P&gt;&lt;A href="https://www.aaai.org/Papers/Workshops/2007/WS-07-08/WS07-08-002.pdf"&gt;https://www.aaai.org/Papers/Workshops/2007/WS-07-08/WS07-08-002.pdf&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="http://ijcsit.com/docs/Volume%207/vol7issue4/ijcsit2016070424.pdf"&gt;http://ijcsit.com/docs/Volume%207/vol7issue4/ijcsit2016070424.pdf&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://getstream.io/blog/best-practices-feed-personalization/"&gt;https://getstream.io/blog/best-practices-feed-personalization/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://blog.statsbot.co/recommendation-system-algorithms-ba67f39ac9a3"&gt;https://blog.statsbot.co/recommendation-system-algorithms-ba67f39ac9a3&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://www.marutitech.com/recommendation-engine-benefits/"&gt;https://www.marutitech.com/recommendation-engine-benefits/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;Thanks &amp;amp; Regards,&lt;P&gt;&amp;nbsp;&lt;/P&gt;Sandhiya</description>
      <pubDate>Thu, 24 May 2018 06:11:25 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Distribution-for-Python/Stackoverflow-implicit-feedback-recommendation-system/m-p/544801#M13</guid>
      <dc:creator>idata</dc:creator>
      <dc:date>2018-05-24T06:11:25Z</dc:date>
    </item>
    <item>
      <title>Re: Stackoverflow implicit feedback recommendation system</title>
      <link>https://community.intel.com/t5/Intel-Distribution-for-Python/Stackoverflow-implicit-feedback-recommendation-system/m-p/544802#M14</link>
      <description>Hi,&lt;P&gt;&amp;nbsp;&lt;/P&gt; We are closing this discussion since we do not handle these types of questions in our community.&lt;P&gt;&amp;nbsp;&lt;/P&gt; If you have a question about Intel-specific AI frameworks/tools, we would be happy to address your queries.&lt;P&gt;&amp;nbsp;&lt;/P&gt; Thanks &amp;amp; Regards,&lt;P&gt;&amp;nbsp;&lt;/P&gt;Sandhiya</description>
      <pubDate>Fri, 25 May 2018 05:16:35 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Distribution-for-Python/Stackoverflow-implicit-feedback-recommendation-system/m-p/544802#M14</guid>
      <dc:creator>idata</dc:creator>
      <dc:date>2018-05-25T05:16:35Z</dc:date>
    </item>
  </channel>
</rss>

