Theano Debugging Notes - Machine Learning and NLP Study Notes

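When implementing things in Theano, I found debugging surprisingly hard to get a handle on.

Unlike computing with numpy, when an error occurs during computation it is hard to pin down what went wrong.

That is probably not the case once you are fluent with it, but as a beginner I found it difficult.

So, as a memo to myself, here is a summary of debugging in Theano.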

The official site has a troubleshooting guide here:
Debugging Theano: FAQ and Troubleshooting — Theano 1.0.0 documentation

Understanding the error message

This is the code the official site uses as an error example:

import numpy as np
import theano
import theano.tensor as T

x = T.vector()
y = T.vector()
z = x + x
z = z + y
f = theano.function([x, y], z)
f(np.ones((2,)), np.ones((3,)))

Running it gives:

Traceback (most recent call last):
...
ValueError: Input dimension mis-match. (input[0].shape[0] = 3, input[1].shape[0] = 2)
Apply node that caused the error: Elemwise{add,no_inplace}(<TensorType(float64, vector)>, <TensorType(float64, vector)>, <TensorType(float64, vector)>)
Inputs types: [TensorType(float64, vector), TensorType(float64, vector), TensorType(float64, vector)]
Inputs shapes: [(3,), (2,), (2,)]
Inputs strides: [(8,), (8,), (8,)]
Inputs scalar values: ['not scalar', 'not scalar', 'not scalar']
HINT: Re-running with most Theano optimization disabled could give you a back-traces when this node was created. This can be done with by setting the Theano flags 'optimizer=fast_compile'. If that does not work, Theano optimization can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint of this apply node.

That is the whole error message.

The cause is simple: z = z + y adds a vector of length 2 to a vector of length 3.
In this case, the line

ValueError: Input dimension mis-match. (input[0].shape[0] = 3, input[1].shape[0] = 2)

is enough to tell.
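For comparison, here is a minimal NumPy-only sketch of the same computation. It is not Theano code; it only illustrates that NumPy evaluates eagerly, so the mismatch surfaces on the exact line that adds the two vectors, whereas Theano reports it only when the compiled function is called.

```python
import numpy as np

x = np.ones((2,))
y = np.ones((3,))

z = x + x  # fine: both operands have shape (2,)

try:
    z = z + y  # shapes (2,) and (3,) cannot be broadcast together
    error = None
except ValueError as exc:
    # NumPy fails right here, on the offending line
    error = exc

print(type(error).__name__)
```

Since the failing addition never completes, z still holds the result of x + x afterwards.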

Reading further down the error message, there are these lines:

Inputs types: [TensorType(float64, vector), TensorType(float64, vector), TensorType(float64, vector)]
Inputs shapes: [(3,), (2,), (2,)]
Inputs strides: [(8,), (8,), (8,)]
Inputs scalar values: ['not scalar', 'not scalar', 'not scalar']

From these you can check the TensorType and shape of each input to the failing node, which is built from x and y.
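The strides entries can be read the same way as NumPy strides: the number of bytes to step to reach the next element along each axis. A small NumPy sketch (an illustration only, assuming the arrays are the contiguous float64 arrays that np.ones produces) shows where the (8,) comes from:

```python
import numpy as np

a = np.ones((3,))     # float64 by default, 8 bytes per element
m = np.ones((2, 3))   # C-contiguous 2-D array for comparison

# One step along the only axis of `a` moves 8 bytes
print(a.dtype, a.itemsize, a.strides)

# In `m`, one row is 3 elements * 8 bytes = 24 bytes
print(m.strides)
```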

optimizer

Setting theano.config.optimizer = 'fast_compile' is supposed to make Theano show where the error originated.

Trying it gives:

Backtrace when the node is created(use Theano flag traceback.limit=N to make it longer):
File "***", line 10, in <module>
z = z + y

so it points directly at the computation that caused the error.

For simple computations this should be all you need.
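Besides assigning to theano.config, the flags named in the HINT lines can be passed through the THEANO_FLAGS environment variable as comma-separated key=value pairs, set before theano is imported. The sketch below only builds that string; build_theano_flags is a hypothetical helper written for this memo, not part of Theano.

```python
import os


def build_theano_flags(options):
    """Join config options into a THEANO_FLAGS string (hypothetical helper)."""
    return ",".join("{}={}".format(key, value) for key, value in options.items())


flags = build_theano_flags({"optimizer": "fast_compile"})

# Must be set before `import theano` for the flag to be picked up
os.environ["THEANO_FLAGS"] = flags
print(flags)
```

The same thing can be done from the shell by prefixing the command with THEANO_FLAGS='optimizer=fast_compile'.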

exception_verbosity

At the end of the error message there was this hint:

HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint of this apply node.

Actually setting it gives:

Debugprint of the apply node:
Elemwise{add,no_inplace} [id A] ''
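|<TensorType(float64, vector)> [id B] <TensorType(float64, vector)>
|<TensorType(float64, vector)> [id B] <TensorType(float64, vector)>
|<TensorType(float64, vector)> [id C] <TensorType(float64, vector)>
Storage map footprint:
- <TensorType(float64, vector)>, Input, Shape: (3L,), ElemSize: 8 Byte(s), TotalSize: 24 Byte(s)
- <TensorType(float64, vector)>, Input, Shape: (2L,), ElemSize: 8 Byte(s), TotalSize: 16 Byte(s)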
TotalSize: 40 Byte(s) 0.000 GB
TotalSize inputs: 40 Byte(s) 0.000 GB

This was hard to read, so I gave each symbol a name.

x = T.vector('x')
y = T.vector('y')

Here is the result.

Debugprint of the apply node:
Elemwise{add,no_inplace} [id A] ''
|x [id B] <TensorType(float64, vector)>
|x [id B] <TensorType(float64, vector)>
|y [id C] <TensorType(float64, vector)>
Storage map footprint:
- y, Input, Shape: (3L,), ElemSize: 8 Byte(s), TotalSize: 24 Byte(s)
- x, Input, Shape: (2L,), ElemSize: 8 Byte(s), TotalSize: 16 Byte(s)
TotalSize: 40 Byte(s) 0.000 GB
TotalSize inputs: 40 Byte(s) 0.000 GB

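Elemwise{add,no_inplace} is the node that defines the addition.
I take this to mean that the expression uses

x
x (z in the code, but internally it is an addition of the x symbol)
y

as its inputs.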

Recalling the first error message, it contained the line

Apply node that caused the error: Elemwise{add,no_inplace}(x, x, y)

I took the hint to mean that Debugprint lets you look at this node in more detail.
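To make the "x, x, y" reading concrete, here is a toy model in plain Python. It is not how Theano is implemented; it only assumes, as the output above suggests, that the nested additions (x + x) + y end up as a single add node with three inputs. The tuple encoding and flatten_add are made up for this illustration.

```python
def flatten_add(expr):
    """Collect the leaf operands of nested ("add", left, right) tuples, left to right."""
    if isinstance(expr, tuple) and expr[0] == "add":
        operands = []
        for child in expr[1:]:
            operands.extend(flatten_add(child))
        return operands
    # Anything that is not an add node is treated as a leaf symbol
    return [expr]


z = ("add", "x", "x")   # z = x + x
z = ("add", z, "y")     # z = z + y

inputs = flatten_add(z)
print(inputs)
```

Under that assumption, the z in the source never shows up as its own input: only the leaf symbols x and y do, with x appearing twice.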

test_value

It is also common for compilation to succeed and the error to appear only when actual values are passed to the function.
For example,

import numpy
import theano
import theano.tensor as T

# compute_test_value is 'off' by default, meaning this feature is inactive
theano.config.compute_test_value = 'off'
# theano.config.compute_test_value = 'warn'

# configure shared variables
W1val = numpy.random.rand(2, 10, 10).astype(theano.config.floatX)
W1 = theano.shared(W1val, 'W1')
W2val = numpy.random.rand(15, 20).astype(theano.config.floatX)
W2 = theano.shared(W2val, 'W2')

# input which will be of shape (5,10)
x = T.matrix('x')
# provide Theano with a default test-value
# x.tag.test_value = numpy.random.rand(5, 10)

# transform the shared variable in some way. Theano does not
# know off hand that the matrix func_of_W1 has shape (20, 10)
func_of_W1 = W1.dimshuffle(2, 0, 1).flatten(2).T

# source of error: dot product of 5x10 with 20x10
h1 = T.dot(x, func_of_W1)

# do more stuff
h2 = T.dot(h1, W2.T)

# compile and call the actual function
f = theano.function([x], h2)
f(numpy.random.rand(5, 10))

In this case the error does not occur until the last line, where actual values are passed in.

In that situation, setting things up like this:

import numpy
import theano
import theano.tensor as T

# compute_test_value is 'off' by default, meaning this feature is inactive
# theano.config.compute_test_value = 'off'
theano.config.compute_test_value = 'warn'

# configure shared variables
W1val = numpy.random.rand(2, 10, 10).astype(theano.config.floatX)
W1 = theano.shared(W1val, 'W1')
W2val = numpy.random.rand(15, 20).astype(theano.config.floatX)
W2 = theano.shared(W2val, 'W2')

# input which will be of shape (5,10)
x = T.matrix('x')
# provide Theano with a default test-value
x.tag.test_value = numpy.random.rand(5, 10)

# transform the shared variable in some way. Theano does not
# know off hand that the matrix func_of_W1 has shape (20, 10)
func_of_W1 = W1.dimshuffle(2, 0, 1).flatten(2).T

# source of error: dot product of 5x10 with 20x10
h1 = T.dot(x, func_of_W1)

# do more stuff
h2 = T.dot(h1, W2.T)

# compile and call the actual function
f = theano.function([x], h2)
f(numpy.random.rand(5, 10))

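With this setup, the expressions are checked against the test values while the graph is being built, without actually calling the function.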
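What the test values buy you can be mimicked with NumPy alone: push concrete arrays of the intended shapes through the same operations and see which line fails. The sketch below is an approximation, not Theano code. It assumes transpose(2, 0, 1).reshape(10, -1) is the NumPy counterpart of dimshuffle(2, 0, 1).flatten(2), and the "fix" at the end is only a guess at the intended shapes.

```python
import numpy as np

W1val = np.random.rand(2, 10, 10)
W2val = np.random.rand(15, 20)
x_test = np.random.rand(5, 10)   # plays the role of x.tag.test_value

# Assumed NumPy counterpart of W1.dimshuffle(2, 0, 1).flatten(2).T
func_of_W1 = W1val.transpose(2, 0, 1).reshape(10, -1).T
print(func_of_W1.shape)          # shape of the right-hand operand

try:
    h1 = np.dot(x_test, func_of_W1)   # 5x10 with 20x10: inner dimensions differ
    error = None
except ValueError as exc:
    error = exc
print(type(error).__name__)

# One possible fix (an assumption about the intent): undo the final transpose
h1_fixed = np.dot(x_test, func_of_W1.T)   # 5x10 with 10x20
h2_fixed = np.dot(h1_fixed, W2val.T)      # 5x20 with 20x15
print(h2_fixed.shape)
```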