諾比 Reads Papers: June 2014

Monday, June 23, 2014

Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups

Hinton, Geoffrey, et al. "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups." Signal Processing Magazine, IEEE 29.6 (2012): 82-97.

To be continued T____T

Representation Learning: A Review and New Perspectives

Bengio, Yoshua, Aaron Courville, and Pascal Vincent. "Representation learning: A review and new perspectives." Pattern Analysis and Machine Intelligence, IEEE Transactions on 35.8 (2013): 1798-1828.

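How well a machine learning method performs depends heavily on how the data is represented.
This paper is about representation learning, that is, learning representations of the data
so that when we build classifiers or other predictors,
extracting the useful information in the data becomes simpler and more effective.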

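Over the past few years, representation learning has been used extensively, and with very good results, in speech recognition and signal processing, object recognition, natural language processing,
as well as multi-task learning, transfer learning, and domain adaptation. This shows again and again that representation learning is a concept we have to care about!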

But what makes a representation a good one?
We usually weigh the following considerations:

smoothness,

This means that when x and y are close, f(x) and f(y) should be close as well.

On the other hand, relying on smoothness alone runs into the curse of dimensionality.
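To get a feel for the curse of dimensionality, here is a rough sketch of my own (not from the paper). If a learner only generalizes to nearby points, it needs at least one training example in every small cell of the input space. The function name `cells_needed` and the setup (a unit hypercube with every axis split into equal bins) are assumptions made just for this illustration.

```python
def cells_needed(dim: int, bins_per_axis: int) -> int:
    """Count the equal-sized cells obtained when every axis of a
    unit hypercube is split into `bins_per_axis` bins."""
    return bins_per_axis ** dim


# The count grows exponentially with the number of dimensions.
for dim in (1, 2, 3, 10):
    print(dim, cells_needed(dim, 10))
```

With 10 bins per axis the count is multiplied by 10 for every extra dimension, so covering the space with examples quickly becomes hopeless.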

multiple explanatory factors,

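This idea builds on the concept of distributed representation.

In other words, we want to capture as much of the structure of the raw data as possible with as compact a representation as possible.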
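A small counting sketch may help here. It compares a local (one-hot) code, where only one unit is active at a time, with a distributed code, where every unit can be on or off independently. The helper names `local_codes` and `distributed_codes` are hypothetical, and binary units are a simplifying assumption of mine.

```python
from itertools import product


def local_codes(n_units):
    """One-hot (local) codes: exactly one unit is active at a time."""
    return [tuple(1 if i == j else 0 for i in range(n_units))
            for j in range(n_units)]


def distributed_codes(n_units):
    """Distributed codes: each unit is switched on or off independently."""
    return list(product((0, 1), repeat=n_units))


for n in (2, 4, 8):
    print(n, len(local_codes(n)), len(distributed_codes(n)))
```

With n units the local code can tell apart n configurations, while the distributed code can tell apart 2^n, which is the sense in which a small representation can describe a lot of structure.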

a hierarchical organization of explanatory factors,

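Everyday objects tend to have a hierarchical structure: a rubber surface and metal screws make up a tire,

and the tire in turn is one small part of different objects such as a motorcycle or a car.

Deep representations build on this idea.

The key to a deep representation is feature reuse: the same metal screws, or a larger part such as the tire, are components shared across objects. This reuse is also at the core of distributed representation.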

Beyond these, there are several other machine learning considerations:

semi-supervised learning,

shared factors across tasks

manifolds,

natural clustering,

temporal and spatial coherence

sparsity

simplicity of factor dependencies

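Taking all of the above into account, a deep architecture looks like the natural choice,

so the rest of the paper focuses on feature learning algorithms

that can be used to learn deep architectures.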

They fall roughly into the following three groups:

1. Probabilistic models

These include directed graphical models: PCA, sparse coding

and undirected graphical models: RBM

2. Directly learning a parametric map from input to representation

This is what we call the auto-encoder.

3. Representation learning as manifold learning
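To make the second group a bit more concrete, here is a minimal sketch of an auto-encoder in NumPy. It is deliberately simplified and is not the paper's formulation: the encoder and decoder are plain linear maps without biases, training is full-batch gradient descent on the mean squared reconstruction error, and the names `make_toy_data` and `train_linear_autoencoder` as well as all hyperparameters are my own assumptions.

```python
import numpy as np


def make_toy_data(n_samples=200, seed=0):
    """Toy data: 5 observed features generated from 2 hidden factors."""
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n_samples, 2))
    mixing = rng.normal(size=(2, 5))
    X = latent @ mixing
    # Standardize each feature to zero mean and unit variance.
    return (X - X.mean(axis=0)) / X.std(axis=0)


def train_linear_autoencoder(X, code_dim, lr=0.1, epochs=500, seed=0):
    """Fit encoder/decoder matrices by gradient descent on the
    mean squared reconstruction error. Returns both matrices and
    the loss recorded at every epoch."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_enc = rng.normal(scale=0.1, size=(d, code_dim))
    W_dec = rng.normal(scale=0.1, size=(code_dim, d))
    losses = []
    for _ in range(epochs):
        code = X @ W_enc            # input -> representation
        X_hat = code @ W_dec        # representation -> reconstruction
        err = X_hat - X
        losses.append(float(np.mean(err ** 2)))
        scale = 2.0 / (n * d)       # derivative of the mean of squares
        grad_dec = scale * (code.T @ err)
        grad_enc = scale * (X.T @ (err @ W_dec.T))
        W_enc -= lr * grad_enc
        W_dec -= lr * grad_dec
    return W_enc, W_dec, losses


X = make_toy_data()
W_enc, W_dec, losses = train_linear_autoencoder(X, code_dim=2)
print(losses[0], losses[-1])
```

The learned representation of an input row `x` is simply `x @ W_enc`. Real auto-encoders usually add nonlinearities and some form of regularization, which this sketch leaves out.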

Sunday, June 22, 2014

A Survey on Transfer Learning

Pan, Sinno Jialin, and Qiang Yang. "A survey on transfer learning." Knowledge and Data Engineering, IEEE Transactions on 22.10 (2010): 1345-1359.

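Traditional machine learning usually performs well only when the test data and the training data come from the same feature space and the same probability distribution. In the real world, however, this assumption does not always hold, and collecting new data under a different distribution is expensive. Transfer learning tries to address this kind of problem.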
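The following toy sketch only illustrates the problem, not any method from the survey. A straight line is fitted on a "source" range of inputs and then evaluated on a shifted "target" range of the same feature. The function `true_function`, the two input ranges, and the helper names are all assumptions I made up for this example.

```python
import numpy as np


def true_function(x):
    """Hypothetical ground truth shared by both domains."""
    return x ** 2


def fit_line(x, y):
    """Least-squares straight line; returns (slope, intercept)."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope, intercept


def mse(model, x, y):
    """Mean squared error of the fitted line on (x, y)."""
    slope, intercept = model
    return float(np.mean((slope * x + intercept - y) ** 2))


# Same feature space, but the inputs come from different ranges.
x_source = np.linspace(0.0, 1.0, 50)
x_target = np.linspace(2.0, 3.0, 50)

model = fit_line(x_source, true_function(x_source))
source_error = mse(model, x_source, true_function(x_source))
target_error = mse(model, x_target, true_function(x_target))
print(source_error, target_error)
```

The line fits the source range reasonably well but is far off on the target range, which is exactly the situation where retraining from scratch would require new, expensive data.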