
10420CS 573100 音樂資訊檢索 Music Information Retrieval

Lecture 15: Tagging and Recommendation
Yi-Hsuan Yang, Ph.D. http://www.citi.sinica.edu.tw/pages/yang/

Music & Audio Computing Lab, Research Center for IT Innovation, Academia Sinica

Outline • Music auto-tagging • Music recommendation

Music Auto-tagging: What • ISMIR 2008 tutorial ‒ http://www.slideshare.net/plamere/social-tags-and-musicinformation-retrieval-part-i-presentation

Music Auto-tagging: What • What are social tags? • Why do people tag? • Issues with social tags

Read more in the tutorial slides

Music Auto-tagging: Why

http://www.licensequote.com/blog/index.php/2011/06/find-music-search-features-tips/

Music Auto-tagging: Why • Semantic music annotation and retrieval

From Doug Turnbull’s slides

Music Auto-tagging: Why • Tag-colorizing interface (http://slm.iis.sinica.edu.tw/SoTags/)

From Ju-Chiang Wang’s slides

Music Auto-tagging: How • Given: x ∈ ℝ^D: feature representation; y ∈ {0, 1}: class labels
• Training: training data → feature extraction → feature → model training; manual annotation → ground truth → model training
• Testing: test data → feature extraction → feature → model → automatic prediction → estimate
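The training and testing flow above can be sketched in a few lines of NumPy. This is only an illustration under loud assumptions: the "audio" is synthetic, `make_clip`, `extract_features` and `TAG_DIRECTIONS` are hypothetical stand-ins, and one logistic-regression model per tag is used as a simple choice of classifier, not the method of any particular system.

```python
import numpy as np

rng = np.random.default_rng(0)
N_TAGS, N_DIMS = 3, 4
# Hypothetical: each tag shifts one feature dimension by 2.
TAG_DIRECTIONS = 2.0 * np.eye(N_TAGS, N_DIMS)

def make_clip(tags, n_frames=50):
    """Synthetic 'audio': frames scattered around a tag-dependent centre."""
    return tags @ TAG_DIRECTIONS + rng.normal(size=(n_frames, N_DIMS))

def extract_features(clip):
    """Toy feature extraction: average the frame vectors of one clip."""
    return clip.mean(axis=0)

def add_bias(X):
    """Append a constant column so each tagger learns an offset."""
    return np.hstack([X, np.ones((len(X), 1))])

def train(X, Y, lr=0.5, epochs=300):
    """One logistic-regression tagger per tag, fitted by gradient descent."""
    Xb = add_bias(X)
    W = np.zeros((Xb.shape[1], Y.shape[1]))
    for _ in range(epochs):
        P = 1.0 / (1.0 + np.exp(-Xb @ W))    # predicted tag probabilities
        W -= lr * Xb.T @ (P - Y) / len(Xb)   # gradient of the mean log loss
    return W

def predict(W, X):
    """Return a probability for every (clip, tag) pair."""
    return 1.0 / (1.0 + np.exp(-add_bias(X) @ W))

# Training: manual annotation (ground truth) + features -> model
Y_train = rng.integers(0, 2, size=(200, N_TAGS)).astype(float)
X_train = np.array([extract_features(make_clip(t)) for t in Y_train])
W = train(X_train, Y_train)

# Testing: features -> model -> estimate
Y_test = rng.integers(0, 2, size=(50, N_TAGS)).astype(float)
X_test = np.array([extract_features(make_clip(t)) for t in Y_test])
estimate = predict(W, X_test)
accuracy = np.mean((estimate > 0.5) == (Y_test == 1))
print(f"tag-level accuracy on held-out clips: {accuracy:.2f}")
```

Each column of `estimate` is the predicted probability of one tag, so thresholding at 0.5 gives a binary tag decision per clip.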

Outline • Music auto-tagging ‒ Collecting an annotated music corpus ‒ Feature extraction and model training

• Music recommendation

Collecting an Annotated Music Corpus: How?

200+ mood terms are adopted in All Music Guide

http://www.allmusic.com/moods

Collecting an Annotated Music Corpus • Reference ‒ https://www.cs.swarthmore.edu/~turnbull/Papers/Turnbull_MusicSearchEngine_Summer07_slides.pdf

1. Crawl the web (social tags or editorial tags) +: cheap, tons of data ‒: noisy, weak labeling, long tail
2. Lab survey +: high quality ‒: expensive, not scalable
3. Game-with-a-purpose +: cheap, scalable

Five approaches to collecting tags for music, ISMIR 2008

Game-with-a-purpose (GWAP) • Human-based computation game • CAPTCHA: Completely Automated Public Turing test to tell Computers and Humans Apart

Game-powered Machine Learning

Game-powered machine learning, PNAS 2012


Active Learning • Music auto-tagging performance as a function of human effort • “Passive” randomly selects songs for annotation • “Active” selects songs through an active learning paradigm • “MGP” (Music Genome Project): annotations by expert musicologists at Pandora.com
Game-powered machine learning, PNAS 2012
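The difference between the "passive" and "active" settings can be illustrated with a small sketch. The slide does not say which query strategy the PNAS 2012 study used, so uncertainty sampling is an assumption here, and the function names and probabilities are hypothetical.

```python
import numpy as np

def select_passive(n_unlabeled, n_queries, rng):
    """Passive: pick songs to annotate uniformly at random."""
    return rng.choice(n_unlabeled, size=n_queries, replace=False)

def select_active(tag_probs, n_queries):
    """Active (uncertainty sampling): pick the songs whose predicted
    tag probability is closest to 0.5, i.e. where the model is least sure."""
    uncertainty = -np.abs(tag_probs - 0.5)
    return np.argsort(uncertainty)[::-1][:n_queries]

# Hypothetical model outputs for five unlabeled songs and one tag.
tag_probs = np.array([0.95, 0.52, 0.10, 0.45, 0.80])
active_picks = select_active(tag_probs, n_queries=2)
passive_picks = select_passive(len(tag_probs), 2, np.random.default_rng(0))
print("active picks:", active_picks)
print("passive picks:", passive_picks)
```

The active picks are the two songs with probabilities 0.52 and 0.45, the ones a human label would be expected to help most under this heuristic.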

Collecting an Annotated Music Corpus
4. Crowdsourcing (Amazon Mechanical Turk) +: cheap, scalable ‒: need dedicated quality control

Feature Extraction and Model Training • Multi-label classification problem

Slides from http://www.slideshare.net/abifet/presentation-42833362

Feature Extraction and Model Training • Capture label dependence

Slides from http://www.slideshare.net/abifet/presentation-42833362

Feature Extraction and Model Training • Capture temporal information
[Figure: music tracks from any given music database → audio signal → global set F of individual frame vectors randomly selected from each track → EM training → global GMM (components N1, …, NK) for acoustic feature encoding]
From Ju-Chiang Wang’s slides

Feature Extraction and Model Training • Capture temporal information
[Figure: feature vectors → global GMM (components N1, …, NK) → GMM posterior (count over components 1, …, K); other labels in the figure: tag-based music aspects, tag label]
From Ju-Chiang Wang’s slides
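As a rough illustration of encoding a track with a global GMM, the sketch below computes the posterior of each component for every frame and averages the posteriors over the track. It assumes diagonal covariances and GMM parameters that have already been trained (the EM step is not shown); the function names and toy numbers are hypothetical.

```python
import numpy as np

def gmm_posteriors(frames, weights, means, variances):
    """Posterior p(k | x_t) for a diagonal-covariance GMM; shape (T, K)."""
    diff = frames[:, None, :] - means[None, :, :]             # (T, K, D)
    log_pdf = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                      + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_joint = np.log(weights) + log_pdf                     # (T, K)
    log_joint -= log_joint.max(axis=1, keepdims=True)         # numerical stability
    post = np.exp(log_joint)
    return post / post.sum(axis=1, keepdims=True)

def encode_track(frames, weights, means, variances):
    """Track-level representation: average posterior over all frames."""
    return gmm_posteriors(frames, weights, means, variances).mean(axis=0)

# Toy global GMM with K = 2 components in D = 2 dimensions.
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [5.0, 5.0]])
variances = np.ones((2, 2))

# Three frames near component 1 and one frame near component 2.
frames = np.array([[0.0, 0.0], [0.1, -0.1], [-0.2, 0.1], [5.0, 5.0]])
track_code = encode_track(frames, weights, means, variances)
print(track_code)
```

The resulting vector has one entry per component and sums to 1, so it can be used as a fixed-length input for a tag classifier regardless of track length.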

Feature Extraction and Model Training • Audio + text

From Doug Turnbull’s slides

Feature Extraction and Model Training • Feature design (domain knowledge + signal processing) +: physical meaning, scientific value ‒: good feature design is hard to come by (often taking several years of research, development and validation)

Moving beyond feature design: Deep architectures and automatic feature learning, ISMIR 2012

Feature Extraction and Model Training • Deep learning: from feature extraction to feature learning +: task (data)-driven, empirically good results, ease of implementation, engineering value ‒: needs large amounts of data, limited interpretability
Y. Bengio, Deep Learning, MLSS 2014

Deep Learning • Cascade of multiple layers, composed of a few simple operations ‒ Linear algebra, point-wise nonlinearities, pooling ‒ Cascaded non-linearities allow for complex systems (the composite of two non-linear systems is an entirely different system)

From Eric J. Humphrey’s slides

Deep Learning • Conventional feature extraction pipeline

From Yann LeCun’s slides

Deep Learning • Deep learning learns feature hierarchies

From Yann LeCun’s slides

Deep Learning • Perceptron • Deep neural network
[Figure: perceptron vs. deep neural network]
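The contrast between a perceptron and a deep neural network can be made concrete with a forward pass. This is a minimal sketch: the layer sizes, random weights, and the choice of ReLU hidden units with a sigmoid output are assumptions for illustration, not details taken from the slide.

```python
import numpy as np

def perceptron(x, w, b):
    """Single unit: weighted sum followed by a step function."""
    return 1.0 if x @ w + b > 0 else 0.0

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def deep_network(x, layers):
    """Cascade of layers: ReLU on hidden layers, sigmoid on the output.
    `layers` is a list of (W, b) pairs."""
    h = x
    for W, b in layers[:-1]:
        h = relu(h @ W + b)
    W, b = layers[-1]
    return sigmoid(h @ W + b)

# Perceptron on a 2-D input: 0.5*1 - 0.5*2 + 0.1 = -0.4, so the unit is off.
p_out = perceptron(np.array([1.0, 2.0]), np.array([0.5, -0.5]), 0.1)

# Deep network 4 -> 8 -> 8 -> 3 with hypothetical random weights.
rng = np.random.default_rng(0)
sizes = [4, 8, 8, 3]
layers = [(rng.normal(scale=0.5, size=(m, n)), np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]
d_out = deep_network(rng.normal(size=4), layers)
print(p_out, d_out)
```

With three sigmoid outputs, the network can be read as producing one probability per tag, which matches the multi-label setting described earlier.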

Deep Learning • Tutorial by Dr. Hung-Yi Lee ‒ http://www.slideshare.net/tw_dsconf/ss-62245351 ‒ http://speech.ee.ntu.edu.tw/~tlkagk/courses.html

• Deep Learning in Music Informatics (ISMIR 2013 tutorial + Python tutorial) ‒ http://steinhardt.nyu.edu/marl/research/deep_learning_in_music_informatics

• Tools ‒ http://www.teglor.com/b/deep-learning-libraries-language-cm569/

• Resources ‒ http://deeplearning.net/

Deep Learning • Why deep? (fat + short vs. thin + tall) • Recipe of deep learning ‒ Choosing proper loss ‒ Mini-batch ‒ ReLU ‒ Adagrad ‒ Momentum
From Hung-Yi Lee’s slides
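Three items of the recipe (mini-batch, Adagrad, momentum) can be written as short update rules. The sketch below uses a toy least-squares problem rather than a neural network, and the learning rates, batch size and data are assumptions chosen only to make the example run.

```python
import numpy as np

def momentum_step(w, grad, velocity, lr=0.01, beta=0.9):
    """Momentum: keep a running velocity and move along it."""
    velocity = beta * velocity - lr * grad
    return w + velocity, velocity

def adagrad_step(w, grad, accum, lr=0.5, eps=1e-8):
    """Adagrad: per-parameter step size shrinks with accumulated squared gradients."""
    accum = accum + grad ** 2
    return w - lr * grad / (np.sqrt(accum) + eps), accum

def mse_grad(w, X, y):
    """Gradient of the mean squared error on one mini-batch."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])   # hypothetical target weights
y = X @ w_true

w, accum = np.zeros(3), np.zeros(3)
initial_loss = np.mean((X @ w - y) ** 2)
for epoch in range(50):
    order = rng.permutation(len(y))           # reshuffle every epoch
    for start in range(0, len(y), 20):        # mini-batches of 20 examples
        batch = order[start:start + 20]
        w, accum = adagrad_step(w, mse_grad(w, X[batch], y[batch]), accum)
final_loss = np.mean((X @ w - y) ** 2)
print(initial_loss, final_loss)

# One momentum step from rest, for comparison of the update rule.
w_m, v_m = momentum_step(np.zeros(1), np.ones(1), np.zeros(1), lr=0.1)
```

Swapping `adagrad_step` for `momentum_step` in the loop only requires carrying `velocity` instead of `accum` between iterations.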

Deep Learning • Recipe of deep learning ‒ Early stopping ‒ Weight decay ‒ Dropout ‒ Network structure: convolutional neural network (CNN), recurrent neural network (RNN), ultra deep network
From Hung-Yi Lee’s slides
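The three regularization items in this part of the recipe can each be sketched as a small function. The versions below (inverted dropout, an L2-style weight-decay step, and patience-based early stopping) are common formulations chosen for illustration; the slide does not specify which variants were presented, and the validation losses are made up.

```python
import numpy as np

def dropout(h, p_drop, rng, training=True):
    """Inverted dropout: zero each unit with probability p_drop and
    rescale the survivors so the expected activation is unchanged."""
    if not training or p_drop == 0.0:
        return h
    mask = rng.random(h.shape) >= p_drop
    return h * mask / (1.0 - p_drop)

def weight_decay_step(w, grad, lr=0.1, decay=0.01):
    """Gradient step with an extra pull of the weights toward zero."""
    return w - lr * (grad + decay * w)

def early_stopping_epoch(val_losses, patience=2):
    """Return the epoch with the best validation loss, stopping once
    `patience` epochs have passed without improvement."""
    best, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break
    return best_epoch

rng = np.random.default_rng(0)
dropped = dropout(np.ones(1000), p_drop=0.5, rng=rng)
decayed = weight_decay_step(np.array([1.0]), np.array([0.0]))
# Hypothetical validation curve: training would stop before the late dip at 0.6.
stop_at = early_stopping_epoch([1.0, 0.8, 0.7, 0.75, 0.9, 0.6])
print(dropped[:5], decayed, stop_at)
```

With `patience=2` the function keeps epoch 2 (loss 0.7) and never sees the later value 0.6, which shows the trade-off that the patience setting controls.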

Summary • Music auto-tagging
‒ Collecting an annotated music corpus: crawl the web, lab survey, game-with-a-purpose, crowdsourcing
‒ Feature extraction and model training: multi-label classification, feature design, (deep) feature learning

Outline • Music auto-tagging ‒ Collecting an annotated music corpus ‒ Feature extraction and model training

• Music recommendation

Music Recommendation: What • Paul Lamere at RecSys 2012 ‒ http://www.slideshare.net/plamere/ive-got-10-millionsongs-in-my-pocket-now-what

• 10 things special about music recommendation
1. Very large item space
2. Very low cost per item
3. Low consumption time
4. Highly interactive
5. Very high per-item reuse
6. Large personal collections
7. Consumed in sequences
8. Highly contextual usage
9. Highly passionate users
10. OMG Metadata

‒ More about Paul Lamere (https://musicmachinery.com/)

Music Recommendation: Why • Engage the indifferents & empower the fans

Music Recommendation: How • RecSys 2014 tutorial ‒ http://www.slideshare.net/xamat/recsys-2014-tutorial-therecommender-problem-revisited

• Approaches ‒ Collaborative filtering (CF) ‒ Content-based (CB) ‒ Hybrid

Matrix Factorization-based CF

− http://www.slideshare.net/studentalei/matrix-factorization-techniques-forrecommender-systems − http://www.quuxlabs.com/blog/2010/09/matrix-factorization-a-simple-tutorial-andimplementation-in-python/

Matrix Factorization-based CF • Spotify ‒ http://www.slideshare.net/MrChrisJohnson/algorithmicmusic-recommendations-at-spotify

• Explicit feedback (e.g. rating) vs. implicit feedback (e.g. playcounts)

From Chris Johnson’s slides

Explicit Matrix Factorization

Explicit feedback y

From Chris Johnson’s slides
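For explicit feedback, a common formulation fits user and item vectors so that their dot product approximates each observed rating. The sketch below uses stochastic gradient descent with L2 regularization on a small made-up rating matrix (0 means "not rated"); the hyperparameters and the function name are assumptions, and this is not necessarily the optimizer shown in the slides.

```python
import numpy as np

def factorize_explicit(ratings, n_factors=2, lr=0.01, reg=0.02,
                       epochs=2000, seed=0):
    """Fit ratings[u, i] ~ U[u] @ V[i] on observed (non-zero) entries."""
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    U = rng.normal(scale=0.1, size=(n_users, n_factors))
    V = rng.normal(scale=0.1, size=(n_items, n_factors))
    observed = np.argwhere(ratings > 0)
    for _ in range(epochs):
        for u, i in observed:
            err = ratings[u, i] - U[u] @ V[i]
            u_old = U[u].copy()                 # use the pre-update user vector
            U[u] += lr * (err * V[i] - reg * U[u])
            V[i] += lr * (err * u_old - reg * V[i])
    return U, V

# Hypothetical 5 users x 4 items rating matrix on a 1-5 scale.
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 0, 4],
              [0, 1, 5, 4]], dtype=float)

U, V = factorize_explicit(R)
mask = R > 0
rmse = np.sqrt(np.mean((R - U @ V.T)[mask] ** 2))
print(f"RMSE on observed ratings: {rmse:.3f}")
print(np.round(U @ V.T, 2))   # the zero entries of R are now filled with predictions
```

The filled-in entries of `U @ V.T` are the model's rating predictions for items a user has not rated.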

Implicit Matrix Factorization

Implicit feedback y

From Chris Johnson’s slides
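With implicit feedback such as play counts, one widely used formulation (Hu, Koren and Volinsky's confidence-weighted model) replaces counts with binary preferences and uses the counts as confidence weights, solved by alternating least squares. The sketch below assumes that formulation with confidence 1 + alpha × count; the play counts, `alpha`, and other settings are made up for illustration.

```python
import numpy as np

def als_implicit(counts, n_factors=2, alpha=40.0, reg=0.1, iters=15, seed=0):
    """Confidence-weighted ALS: preference p = 1 if count > 0 else 0,
    confidence c = 1 + alpha * count."""
    rng = np.random.default_rng(seed)
    n_users, n_items = counts.shape
    P = (counts > 0).astype(float)
    C = 1.0 + alpha * counts
    X = rng.normal(scale=0.1, size=(n_users, n_factors))   # user vectors
    Y = rng.normal(scale=0.1, size=(n_items, n_factors))   # item vectors
    ridge = reg * np.eye(n_factors)
    for _ in range(iters):
        for u in range(n_users):               # fix items, solve each user
            YtC = Y.T * C[u]                   # (f, n_items), columns weighted
            X[u] = np.linalg.solve(YtC @ Y + ridge, YtC @ P[u])
        for i in range(n_items):               # fix users, solve each item
            XtC = X.T * C[:, i]
            Y[i] = np.linalg.solve(XtC @ X + ridge, XtC @ P[:, i])
    return X, Y

# Hypothetical play counts: users 0-1 play items 0-1, users 2-3 play items 2-3.
counts = np.array([[10, 3, 0, 0],
                   [7, 5, 0, 0],
                   [0, 0, 4, 9],
                   [0, 0, 8, 2]], dtype=float)

X, Y = als_implicit(counts)
scores = X @ Y.T
played = counts > 0
print(np.round(scores, 2))
```

Unlike the explicit case, every user-item pair contributes to the loss here; unplayed pairs are treated as weak evidence of no preference rather than as missing data.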

How Do We Use the Learned Vectors?

From Chris Johnson’s slides
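Once user and item vectors are learned, two typical uses are ranking unheard items for a user by dot product and finding similar items by cosine similarity. The sketch below assumes those two uses; the vectors are hand-written toy values, not the output of a trained model.

```python
import numpy as np

def recommend(user_vec, item_vecs, already_played, top_n=2):
    """Rank items by dot product with the user vector, skipping known items."""
    scores = item_vecs @ user_vec
    for i in already_played:
        scores[i] = -np.inf
    return np.argsort(-scores)[:top_n]

def similar_items(item_id, item_vecs, top_n=2):
    """Rank other items by cosine similarity to the given item."""
    normed = item_vecs / np.linalg.norm(item_vecs, axis=1, keepdims=True)
    sims = normed @ normed[item_id]
    sims[item_id] = -np.inf                    # exclude the query item itself
    return np.argsort(-sims)[:top_n]

# Hypothetical 2-D latent vectors for four items and one user.
item_vecs = np.array([[1.0, 0.0],
                      [0.9, 0.1],
                      [0.0, 1.0],
                      [0.1, 0.9]])
user_vec = np.array([1.0, 0.2])

recs = recommend(user_vec, item_vecs, already_played={0})
neighbours = similar_items(0, item_vecs)
print("recommended items:", recs)
print("items similar to item 0:", neighbours)
```

For large catalogues the exhaustive `argsort` would usually be replaced by an approximate nearest-neighbour index, which is outside the scope of this sketch.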

Cold Start

Music Recommendation: Evaluation • RecSys 2011 tutorial ‒ http://www.slideshare.net/plamere/musicrecommendation-and-discovery

• What makes a good music recommendation? ‒ Relevance ‒ Novelty / serendipity ‒ Transparency / trust ‒ Reach ‒ Context

Collaborative Filtering vs. Content-based • Collaborative filtering (by measuring item/user similarity in the latent space) ‒ Better in relevance

• Content-based (by measuring item similarity in a feature space) ‒ Better in novelty, transparency, reach

Music Recommendation: Care and Scale • Brian Whitman's comments ‒ http://notes.variogr.am/post/37675885491/how-musicrecommendation-works-and-doesnt-work

Brian Whitman (EchoNest co-founder)

Tristan Jehan (EchoNest co-founder)

Know about the Music • Reads about music: lyrics, blog posts, reviews, playlists and discussion forums

• Listens to music: tempo, instrumentation, key, time signature, energy, harmonic & timbral structures

• Learns about trends: online music behavior (who's talking about which artists this week, what songs are being streamed and downloaded)

Know about the Users: Listener Intelligence

From Paul Lamere’s slides

Know about the Users

The neglected user in music information retrieval research, Journal of Intelligent Information Systems, 2013, by Markus Schedl

Hybrid Approach • Collaborative filtering ‒ Higher relevance

• Content-based ‒ Higher novelty, transparency, reach

• Hybrid: CF + CB, and even context-based (Cx)

Tensor Factorization: Adding Contexts
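One simple way to add context to matrix factorization is to give each context its own latent vector and score a (user, item, context) triple by a three-way product. The sketch below assumes this CP-style model; the slide does not say which tensor factorization was shown, and all sizes and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, n_contexts, n_factors = 3, 4, 2, 2

# Hypothetical latent factors; in practice they would be learned from data.
U = rng.normal(size=(n_users, n_factors))
V = rng.normal(size=(n_items, n_factors))
C = rng.normal(size=(n_contexts, n_factors))

def predict_score(U, V, C, user, item, context):
    """CP-style score: sum over factors of user * item * context entries."""
    return float(np.sum(U[user] * V[item] * C[context]))

# Scores for every (user, item, context) triple at once.
all_scores = np.einsum("uf,if,cf->uic", U, V, C)
one_score = predict_score(U, V, C, user=1, item=2, context=0)
print(all_scores.shape, one_score)
```

Compared with plain matrix factorization, the same user-item pair can now receive different scores in different contexts, for example morning versus evening listening.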

Factorization Machine: A Hybrid Model

From Rendle (2012) KDD Tutorial

Factorization Machine: A Hybrid Model

From Rendle (2012) KDD Tutorial
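The second-order factorization machine model predicts with a global bias, linear weights, and pairwise interactions whose strengths are dot products of per-feature latent vectors. The sketch below implements that prediction in its linear-time form and checks it against the naive double loop; the feature layout (one-hot user, one-hot item, one context value) and the random parameters are assumptions for illustration.

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """FM prediction using the O(k * n) reformulation of the pairwise term:
    0.5 * sum_f [ (sum_i V[i, f] x_i)^2 - sum_i V[i, f]^2 x_i^2 ]."""
    linear = w0 + x @ w
    xv = x @ V
    pairwise = 0.5 * np.sum(xv ** 2 - (x ** 2) @ (V ** 2))
    return linear + pairwise

def fm_predict_naive(x, w0, w, V):
    """Same model, written as an explicit sum over feature pairs i < j."""
    total = w0 + x @ w
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            total += (V[i] @ V[j]) * x[i] * x[j]
    return total

rng = np.random.default_rng(0)
n_features, k = 6, 2
w0 = 0.1
w = rng.normal(size=n_features)
V = rng.normal(size=(n_features, k))

# Hypothetical input: user 0 of 3 (one-hot), item 1 of 2 (one-hot), context 0.5.
x = np.array([1.0, 0.0, 0.0, 0.0, 1.0, 0.5])
fast = fm_predict(x, w0, w, V)
slow = fm_predict_naive(x, w0, w, V)
print(fast, slow)
```

Because any side information can be appended to `x` as extra features, the same model covers collaborative, content-based and context features in one hybrid predictor.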

Summary • Music recommendation
‒ Collaborative filtering approach (matrix factorization)
‒ Content-based approach
‒ Context-based approach
‒ Relevance, novelty, transparency, reach, context
‒ Cold start
‒ Bandit tests vs. A/B tests