10420CS 573100 Music Information Retrieval
Lecture 15 Tagging and Recommendation Yi-Hsuan Yang Ph.D. http://www.citi.sinica.edu.tw/pages/yang/
Music & Audio Computing Lab, Research Center for IT Innovation, Academia Sinica
Outline • Music auto-tagging • Music recommendation
Music Auto-tagging: What • ISMIR 2008 tutorial http://www.slideshare.net/plamere/social-tags-and-musicinformation-retrieval-part-i-presentation [link_1]
Music Auto-tagging: What • What are social tags? • Why do people tag? • Issues with social tags
Read more in the tutorial slides
Music Auto-tagging: Why
http://www.licensequote.com/blog/index.php/2011/06/find-music-search-features-tips/
Music Auto-tagging: Why • Semantic music annotation and retrieval
From Doug Turnbull’s slides
Music Auto-tagging: Why • Tag-colorizing interface (http://slm.iis.sinica.edu.tw/SoTags/)
From Ju-Chiang Wang’s slides
Music Auto-tagging: How • Given: a feature representation x (a real-valued feature vector) and class labels y with entries in {0, 1}
[Figure: training data → manual annotation (ground truth) and feature extraction (feature) → model training → model; test data → feature extraction (feature) → model → automatic prediction (estimate)]
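The training and prediction flow above can be sketched in a few lines. This is only an illustration under stated assumptions: the two-dimensional features, the tag names, and the choice of one logistic-regression model per tag are hypothetical and are not taken from the slides.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_tag_model(features, labels, lr=0.5, epochs=500):
    """Fit one binary logistic-regression model (one tag) by gradient descent."""
    dim = len(features[0])
    n = len(features)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_tags(models, x, threshold=0.5):
    """Return every tag whose estimated probability exceeds the threshold."""
    tags = []
    for tag, (w, b) in models.items():
        if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) > threshold:
            tags.append(tag)
    return tags

# Hypothetical features (e.g. normalized energy and tempo) with ground-truth labels.
train_x = [[0.9, 0.8], [0.8, 0.9], [0.2, 0.1], [0.1, 0.3]]
train_y = {"energetic": [1, 1, 0, 0], "mellow": [0, 0, 1, 1]}

# Model training: one independent model per tag.
models = {tag: train_tag_model(train_x, y) for tag, y in train_y.items()}

# Automatic prediction on unseen (test) feature vectors.
print(predict_tags(models, [0.85, 0.9]))
print(predict_tags(models, [0.15, 0.2]))
```

In a real system the feature extraction step would compute audio descriptors from the signal; here it is replaced by hand-written numbers to keep the sketch self-contained.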
Outline • Music auto-tagging Collecting an annotated music corpus Feature extraction and model training
• Music recommendation
Collecting an Annotated Music Corpus: How?
200+ mood terms are adopted in All Music Guide
http://www.allmusic.com/moods
Collecting an Annotated Music Corpus • Reference ‒ https://www.cs.swarthmore.edu/~turnbull/Papers/Turnbull_MusicSearchEngine_Summer07_slides.pdf [link_2]
1. Crawl the web (social tags or editorial tags) +: cheap, tons of data ‒: noisy, weak labeling, long tail
2. Lab survey +: high quality ‒: expensive, not scalable
3. Game-with-a-purpose +: cheap, scalable
Five approaches to collecting tags for music, ISMIR 2008
Game-with-a-purpose (GWAP) • Human-based computation game • Completely automated public Turing test to tell computers and humans apart (CAPTCHA)
Game-powered Machine Learning
Game-powered machine learning, PNAS 2012
Active Learning • Music auto-tagging performance as a function of human effort • “passive”: randomly selects songs for annotation • “active”: leverages an active learning paradigm • “MGP” (Music Genome Project): annotations by expert musicologists at Pandora.com
Game-powered machine learning, PNAS 2012
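The difference between the “passive” and “active” selection strategies can be sketched as follows. The slide does not say which active learning criterion was used, so uncertainty sampling (pick the songs whose current prediction is closest to 0.5) is swapped in here as a common, assumed choice; the function names and song ids are hypothetical.

```python
import random

def select_passive(song_ids, budget, seed=0):
    """Passive strategy: pick songs for annotation uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(sorted(song_ids), budget)

def select_active(tag_probabilities, budget):
    """Active strategy (uncertainty sampling, an assumed criterion):
    pick the songs whose predicted tag probability is closest to 0.5."""
    ranked = sorted(tag_probabilities,
                    key=lambda song: abs(tag_probabilities[song] - 0.5))
    return ranked[:budget]

# Hypothetical model outputs for one tag on four unlabeled songs.
probabilities = {"song_a": 0.95, "song_b": 0.52, "song_c": 0.10, "song_d": 0.45}

print(select_active(probabilities, budget=2))
print(select_passive(probabilities, budget=2))
```

The songs chosen by the active strategy are the ones the current model is least sure about, which is where a human label is expected to be most informative.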
Collecting an Annotated Music Corpus 4. Crowdsourcing (Amazon Mechanical Turk) +: cheap, scalable ‒: need dedicated quality control
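One simple form of the quality control mentioned above is to collect several judgments per (song, tag) pair and keep only labels on which enough workers agree. The sketch below assumes a plain majority vote with an agreement threshold; the threshold value and data layout are illustrative, not a description of any particular crowdsourcing platform.

```python
from collections import Counter

def aggregate_labels(worker_labels, min_agreement=0.6):
    """Keep a (song, tag) label only if a large enough share of workers agree."""
    accepted = {}
    for item, votes in worker_labels.items():
        label, count = Counter(votes).most_common(1)[0]
        if count / len(votes) >= min_agreement:
            accepted[item] = label
    return accepted

# Hypothetical judgments: 1 = "tag applies", 0 = "tag does not apply".
judgments = {
    ("song1", "happy"): [1, 1, 0],   # 2 of 3 agree: kept
    ("song2", "happy"): [1, 0],      # split vote: discarded
    ("song3", "sad"): [0, 0, 0, 0],  # unanimous: kept
}

print(aggregate_labels(judgments))
```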
Feature Extraction and Model Training • Multi-label classification problem
Slides from http://www.slideshare.net/abifet/presentation-42833362
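In the multi-label setting each song can carry several tags at once, so the target is a binary indicator vector rather than a single class. A minimal sketch of that encoding, with a hypothetical tag vocabulary:

```python
def to_multi_hot(tag_lists, vocabulary):
    """Encode each song's tag list as a binary indicator vector."""
    index = {tag: i for i, tag in enumerate(vocabulary)}
    rows = []
    for tags in tag_lists:
        row = [0] * len(vocabulary)
        for tag in tags:
            row[index[tag]] = 1
        rows.append(row)
    return rows

vocabulary = ["rock", "jazz", "happy", "sad"]            # hypothetical tags
song_tags = [["rock", "happy"], ["jazz"], ["jazz", "sad", "happy"]]

for row in to_multi_hot(song_tags, vocabulary):
    print(row)
```

Each column of the resulting matrix can then be treated as its own binary problem, or the columns can be modeled jointly when label dependence matters.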
Feature Extraction and Model Training • Capture label dependence
Slides from http://www.slideshare.net/abifet/presentation-42833362
Feature Extraction and Model Training • Capture temporal information
[Figure: from any given music database, individual frame vectors are randomly selected from the audio signal of each music track to form a global set F; EM training on F yields a global GMM (components N1, N2, …, NK) for acoustic feature encoding]
From Ju-Chiang Wang’s slides
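Feature Extraction and Model Training • Capture temporal information Tag-based music aspects
[Figure: the feature vectors of a track are scored against the global GMM (components N1, N2, …, NK); the GMM posterior over components 1, …, K, accumulated over the track (count), is paired with the track's tag label]
From Ju-Chiang Wang’s slides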
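The GMM posterior encoding can be sketched as follows: every frame is softly assigned to the K components of an already trained global GMM, and the frame posteriors are averaged into one K-dimensional vector per track. The sketch assumes diagonal covariances and hand-set GMM parameters; the EM training step is omitted, and the exact aggregation used in the original work may differ.

```python
import math

def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def frame_posterior(x, weights, means, variances):
    """Posterior probability of each GMM component given one frame."""
    logs = [math.log(w) + log_gaussian(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    peak = max(logs)                      # subtract the max for stability
    exps = [math.exp(value - peak) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]

def encode_track(frames, weights, means, variances):
    """Average the frame posteriors into one K-dimensional track vector."""
    acc = [0.0] * len(weights)
    for x in frames:
        for k, p in enumerate(frame_posterior(x, weights, means, variances)):
            acc[k] += p
    return [a / len(frames) for a in acc]

# Hypothetical global GMM with K = 2 components in a 2-D feature space.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [5.0, 5.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

# Three frames of one track: one near component 1, two near component 2.
frames = [[0.1, -0.2], [4.9, 5.1], [5.2, 4.8]]
track_vector = encode_track(frames, weights, means, variances)
print(track_vector)
```

The resulting vector has a fixed length K regardless of track duration, which makes it usable as input to a tag classifier.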
Feature Extraction and Model Training • Audio + text
From Doug Turnbull’s slides
Feature Extraction and Model Training • Feature design (domain knowledge + signal processing) +: physical meaning, scientific value ‒: good feature design hard to come by (often taking several years of research, development and validation)
Moving beyond feature design: Deep architectures and automatic feature learning, ISMIR 2012
Feature Extraction and Model Training • Deep learning: from feature extraction to feature learning +: task (data)-driven, empirically good results, ease of implementation, engineering value ‒: requires large amounts of data, limited interpretability
Y. Bengio, Deep Learning, MLSS 2014
Deep Learning • Cascade of multiple layers, composed of a few simple operations: linear algebra, point-wise nonlinearities, pooling • Cascaded non-linearities allow for complex systems (the composite of two non-linear systems is an entirely different system)
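The cascade of simple operations can be illustrated with a tiny forward pass in plain Python. Each layer applies a linear map, a point-wise nonlinearity (ReLU is assumed here), and max pooling; the weights are made-up numbers chosen only so the result is easy to follow by hand.

```python
def linear(x, W, b):
    """Linear algebra step: matrix-vector product plus bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def relu(x):
    """Point-wise nonlinearity."""
    return [max(0.0, v) for v in x]

def max_pool(x, size):
    """Pooling: keep the maximum of each non-overlapping group of `size` values."""
    return [max(x[i:i + size]) for i in range(0, len(x), size)]

def forward(x, layers):
    """Cascade the three operations for every layer in turn."""
    for W, b, pool in layers:
        x = max_pool(relu(linear(x, W, b)), pool)
    return x

# Two hypothetical layers: (weights, biases, pooling size).
layers = [
    ([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, -1.0]], [0.0, 0.0, 0.0, 0.0], 2),
    ([[1.0, -1.0], [2.0, 0.5]], [0.5, -1.0], 1),
]

output = forward([1.0, -2.0], layers)
print(output)  # [0.5, 1.5]
```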
From Eric J. Humphrey’s slides
Deep Learning • Conventional feature extraction pipeline
From Yann LeCun’s slides
Deep Learning • Deep learning learns feature hierarchies
From Yann LeCun’s slides
Deep Learning • Perceptron • Deep neural network
[Figure: a perceptron compared with a deep neural network]
Deep Learning • Tutorial by Dr. Hung-Yi Lee http://www.slideshare.net/tw_dsconf/ss-62245351 [link_3] http://speech.ee.ntu.edu.tw/~tlkagk/courses.html
• Deep Learning in Music Informatics (ISMIR 2013 tutorial + Python tutorial) http://steinhardt.nyu.edu/marl/research/deep_learning_in_music_informatics
• Tools http://www.teglor.com/b/deep-learning-libraries-language-cm569/
• Resources http://deeplearning.net/
Deep Learning • Why deep? (fat + short vs. thin + tall) • Recipe of deep learning: choosing a proper loss, mini-batch, ReLU, Adagrad, momentum
From Hung-Yi Lee’s slides
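Two of the recipe items, momentum and Adagrad, are update rules for the weights. The sketch below writes both in their textbook form and runs them on a toy quadratic loss; the loss, learning rates, and step count are illustrative assumptions, and in practice the gradient would come from a mini-batch of training data.

```python
import math

def momentum_step(w, grad, velocity, lr=0.05, mu=0.9):
    """Classical momentum: accumulate a velocity, then move along it."""
    velocity = [mu * v - lr * g for v, g in zip(velocity, grad)]
    w = [wi + vi for wi, vi in zip(w, velocity)]
    return w, velocity

def adagrad_step(w, grad, accum, lr=0.5, eps=1e-8):
    """Adagrad: scale each coordinate by the root of its summed squared gradients."""
    accum = [a + g * g for a, g in zip(accum, grad)]
    w = [wi - lr * g / (math.sqrt(a) + eps) for wi, g, a in zip(w, grad, accum)]
    return w, accum

def gradient(w):
    """Gradient of the toy loss (w0 - 3)^2 + 10 * (w1 + 1)^2."""
    return [2.0 * (w[0] - 3.0), 20.0 * (w[1] + 1.0)]

w_momentum, velocity = [0.0, 0.0], [0.0, 0.0]
w_adagrad, accum = [0.0, 0.0], [0.0, 0.0]
for _ in range(300):
    w_momentum, velocity = momentum_step(w_momentum, gradient(w_momentum), velocity)
    w_adagrad, accum = adagrad_step(w_adagrad, gradient(w_adagrad), accum)

print(w_momentum)  # close to the minimum at [3, -1]
print(w_adagrad)   # close to the minimum at [3, -1]
```

Adagrad gives the steep coordinate a smaller effective step than the shallow one, which is why a single learning rate works for both directions of this loss.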
Deep Learning • Recipe of deep learning: early stopping, weight decay, dropout, network structure, e.g. convolutional neural network (CNN), recurrent neural network (RNN), ultra-deep network
From Hung-Yi Lee’s slides
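Three of these regularization ideas fit in a few lines each. The sketch below assumes the common "inverted" form of dropout (surviving activations are rescaled at training time), weight decay as an L2 penalty added to the gradient, and early stopping with a patience counter; parameter values are illustrative.

```python
import random

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

def weight_decay_step(w, grad, lr=0.1, decay=0.01):
    """Gradient step with an L2 penalty that shrinks weights toward zero."""
    return [wi - lr * (g + decay * wi) for wi, g in zip(w, grad)]

def early_stopping_epoch(val_losses, patience=2):
    """Return the epoch at which training stops: the first epoch where the
    validation loss has not improved for `patience` consecutive epochs."""
    best, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch
    return len(val_losses) - 1

rng = random.Random(0)
print(dropout([1.0, 1.0, 1.0, 1.0], rate=0.5, rng=rng))
print(weight_decay_step([1.0], [0.0]))
print(early_stopping_epoch([1.0, 0.8, 0.7, 0.75, 0.72, 0.9]))  # 4
```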
Summary • Music auto-tagging
Collecting an annotated music corpus: crawl the web, lab survey, game-with-a-purpose, crowdsourcing
Feature extraction and model training: multi-label classification, feature design, (deep) feature learning
Outline • Music auto-tagging Collecting an annotated music corpus Feature extraction and model training
• Music recommendation
Music Recommendation: What • Paul Lamere at RecSys 2012 http://www.slideshare.net/plamere/ive-got-10-millionsongs-in-my-pocket-now-what [link_4]
• 10 things special about music recommendation
1. Very large item space
2. Very low cost per item
3. Low consumption time
4. Highly interactive
5. Very high per-item reuse
6. Large personal collections
7. Consumed in sequences
8. Highly contextual usage
9. Highly passionate users
10. OMG Metadata
More about Paul Lamere (https://musicmachinery.com/)
Music Recommendation: Why • Engage the indifferents & empower the fans
Music Recommendation: How • RecSys 2014 tutorial http://www.slideshare.net/xamat/recsys-2014-tutorial-therecommender-problem-revisited [link_5]
• Approaches Collaborative filtering (CF) Content-based (CB) Hybrid
Matrix Factorization-based CF
− http://www.slideshare.net/studentalei/matrix-factorization-techniques-forrecommender-systems − http://www.quuxlabs.com/blog/2010/09/matrix-factorization-a-simple-tutorial-andimplementation-in-python/
Matrix Factorization-based CF • Spotify (http://www.slideshare.net/MrChrisJohnson/algorithmicmusic-recommendations-at-spotify)
• Explicit feedback (e.g. rating) vs. implicit feedback (e.g. playcounts)
From Chris Johnson’s slides
Explicit Matrix Factorization
Explicit feedback y
From Chris Johnson’s slides
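Explicit matrix factorization approximates the observed ratings by the inner product of a user vector and an item vector, fitted only on the entries that were actually rated. The sketch below assumes stochastic gradient descent on the regularized squared error; the rating matrix, rank, and hyperparameters are toy values, and production systems often use other solvers such as alternating least squares.

```python
import math
import random

def factorize(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02,
              epochs=3000, seed=0):
    """Learn user factors P and item factors Q from (user, item, rating) triples."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(p * q for p, q in zip(P[u], Q[i]))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def rmse(ratings, P, Q):
    """Root-mean-square error on the given rating triples."""
    se = sum((r - sum(p * q for p, q in zip(P[u], Q[i]))) ** 2
             for u, i, r in ratings)
    return math.sqrt(se / len(ratings))

# Toy rating matrix; 0 marks "not rated" and is excluded from training.
R = [
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
]
ratings = [(u, i, r) for u, row in enumerate(R) for i, r in enumerate(row) if r > 0]

P, Q = factorize(ratings, n_users=len(R), n_items=len(R[0]))
print("training RMSE:", rmse(ratings, P, Q))
print("predicted rating for user 0, item 2:",
      sum(p * q for p, q in zip(P[0], Q[2])))
```

The last line shows the point of the exercise: the model produces a score for a user-item pair that was never rated.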
Implicit Matrix Factorization
Implicit feedback y
From Chris Johnson’s slides
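With implicit feedback there are no ratings, only play counts. A common formulation, assumed here, replaces each count with a binary preference (played or not) and uses a function of the count as a confidence weight on the squared error, so that every user-item pair contributes to the loss. The sketch uses plain gradient descent for brevity, whereas large-scale systems typically solve this objective with alternating least squares; the weight 1 + alpha * count and all numbers are illustrative.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def implicit_mf(counts, k=2, alpha=1.0, reg=0.1, lr=0.01, epochs=500, seed=0):
    """Weighted matrix factorization on a dense matrix of play counts."""
    rng = random.Random(seed)
    n_users, n_items = len(counts), len(counts[0])
    X = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Y = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u in range(n_users):
            for i in range(n_items):
                preference = 1.0 if counts[u][i] > 0 else 0.0  # binary label
                confidence = 1.0 + alpha * counts[u][i]        # weight
                err = preference - dot(X[u], Y[i])
                for f in range(k):
                    xu, yi = X[u][f], Y[i][f]
                    X[u][f] += lr * (confidence * err * yi - reg * xu)
                    Y[i][f] += lr * (confidence * err * xu - reg * yi)
    return X, Y

# Hypothetical play counts: users 0-1 listen to items 0-1, users 2-3 to items 2-3.
play_counts = [
    [10, 5, 0, 0],
    [8, 0, 0, 0],
    [0, 0, 6, 9],
    [0, 0, 3, 0],
]

X, Y = implicit_mf(play_counts)
scores_user1 = [dot(X[1], Y[i]) for i in range(4)]
print(scores_user1)
```

User 1 has only played item 0, yet item 1 should score higher than items 2 and 3 because user 0, who shares item 0, also played item 1.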
How Do We Use the Learned Vectors?
From Chris Johnson’s slides
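Once user and item vectors have been learned, two typical uses are finding similar items and ranking unplayed items for a user. The sketch below assumes cosine similarity between item vectors and the inner product for user-item scores; the vectors are hand-written stand-ins for learned ones.

```python
import math

def cosine(a, b):
    """Cosine similarity between two latent vectors."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm

def most_similar(item_id, item_vectors, top_n=2):
    """Items whose latent vectors point in the most similar direction."""
    others = [other for other in item_vectors if other != item_id]
    others.sort(key=lambda other: cosine(item_vectors[item_id], item_vectors[other]),
                reverse=True)
    return others[:top_n]

def recommend(user_vector, item_vectors, already_played, top_n=2):
    """Rank unplayed items by the inner product with the user vector."""
    candidates = [item for item in item_vectors if item not in already_played]
    candidates.sort(key=lambda item: sum(u * v for u, v in
                                         zip(user_vector, item_vectors[item])),
                    reverse=True)
    return candidates[:top_n]

# Hypothetical learned item vectors in a 2-D latent space.
item_vectors = {
    "song_a": [1.0, 0.1],
    "song_b": [0.9, 0.2],
    "song_c": [0.0, 1.0],
    "song_d": [0.1, 0.9],
}

print(most_similar("song_a", item_vectors, top_n=1))
print(recommend([1.0, 0.0], item_vectors, already_played={"song_a"}))
```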
Cold Start
Music Recommendation: Evaluation • RecSys 2011 tutorial http://www.slideshare.net/plamere/musicrecommendation-and-discovery [link_6]
• What makes a good music recommendation? Relevance Novelty / serendipity Transparency / trust Reach Context
Collaborative Filtering vs Content-based • Collaborative filtering (by measuring item/user similarity in the latent space) Better in relevance
• Content-based (by measuring item similarity in a feature space) Better in novelty, transparency, reach
Music Recommendation: Care and Scale • Brian Whitman's comments http://notes.variogr.am/post/37675885491/how-musicrecommendation-works-and-doesnt-work
Brian Whitman (EchoNest co-founder)
Tristan Jehan (EchoNest co-founder)
Know about the Music • Reads about music: lyrics, blog posts, reviews, playlists and discussion forums
• Listens to music: tempo, instrumentation, key, time signature, energy, harmonic & timbral structures
• Learns about trends: online music behavior, such as who's talking about which artists this week and what songs are being streamed and downloaded
Know about the Users: Listener Intelligence
From Paul Lamere’s slides
Know about the Users
The neglected user in music information retrieval research, Journal of Intelligent Information Systems, 2013, by Markus Schedl
Hybrid Approach • Collaborative filtering Higher relevance
• Content-based Higher novelty, transparency, reach
• Hybrid CF, CB + even context-based (Cx)
Tensor Factorization: Adding Contexts
Factorization Machine: A Hybrid Model
From Rendle (2012) KDD Tutorial
Factorization Machine: A Hybrid Model
From Rendle (2012) KDD Tutorial
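A second-order factorization machine scores a feature vector x as a global bias, plus a linear term, plus pairwise interactions whose weights are inner products of per-feature latent vectors. Because x can concatenate user, item, and context features, the same model covers CF, CB, and context-based signals. The sketch below computes the prediction both by the direct pairwise sum and by the equivalent linear-time form; all parameter values and the feature layout are made up for illustration.

```python
def fm_predict_naive(x, w0, w, V):
    """Direct form: w0 + sum_i w_i x_i + sum_{i<j} <v_i, v_j> x_i x_j."""
    n = len(x)
    score = w0 + sum(wi * xi for wi, xi in zip(w, x))
    for i in range(n):
        for j in range(i + 1, n):
            interaction = sum(a * b for a, b in zip(V[i], V[j]))
            score += interaction * x[i] * x[j]
    return score

def fm_predict(x, w0, w, V):
    """Equivalent linear-time form of the pairwise term:
    0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i v_if^2 x_i^2 ]."""
    k = len(V[0])
    score = w0 + sum(wi * xi for wi, xi in zip(w, x))
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(len(x)))
        score += 0.5 * (s * s - s_sq)
    return score

# Hypothetical layout: [user_A, user_B, item_X, context feature].
x = [1.0, 0.0, 1.0, 0.5]
w0 = 0.1
w = [0.2, -0.1, 0.3, 0.0]
V = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5], [0.2, 0.2]]

print(fm_predict_naive(x, w0, w, V))
print(fm_predict(x, w0, w, V))
```

Both functions give the same score (0.78 for these numbers, up to floating-point rounding); the second avoids the double loop over feature pairs.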
Summary • Music recommendation
Approaches: collaborative filtering (matrix factorization), content-based, context-based
Evaluation: relevance, novelty, transparency, reach, context
Cold start
Bandit tests vs. A/B tests