Facial Expression Recognition Based on Caffe

Following another blogger's write-up, I ran through the whole pipeline and reached an accuracy of 0.61. Not great, but it covers the full workflow; happy expressions seemed to be recognized best. Hyperparameter tuning is clearly something that takes hands-on practice.

Dataset

The dataset is fer2013, i.e. the Kaggle Facial Expression Recognition Challenge dataset, currently one of the larger public facial-expression databases.

The database contains 35,887 face images: 28,709 for training, 3,589 for validation, and 3,589 for testing. All images are 48×48-pixel grayscale, labeled with one of seven classes: 0=angry, 1=disgust, 2=fear, 3=happy, 4=sad, 5=surprise, 6=normal (neutral). The classes are roughly balanced, with the notable exception of disgust, which is heavily underrepresented (see the counts below).

Training-set distribution: angry: 3995, disgust: 436, fear: 4097, happy: 7215, sad: 4830, surprise: 3171, normal: 4965

Download the dataset: https://pan.baidu.com/s/1i6p40jb

1) Prepare labels.txt, which maps each class index to its class name.
Create an empty file named labels.txt in the data directory and enter the following:

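The original screenshot showed the file contents; given the class order above, it presumably contains one expression name per line, ordered by class index (the exact spelling is an assumption — it must match whatever your downstream tooling expects):

angry
disgust
fear
happy
sad
surprise
normal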

2) Prepare train.txt, listing each training image's path and its class index, separated by a single space. The paths produced below are relative to the train/ directory.

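# For each class folder, prefix each filename with its subfolder and append the class index: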
ls train/0 | sed "s:^:0/:" | sed "s:$: 0:" >> train.txt  
ls train/1 | sed "s:^:1/:" | sed "s:$: 1:" >> train.txt
ls train/2 | sed "s:^:2/:" | sed "s:$: 2:" >> train.txt
ls train/3 | sed "s:^:3/:" | sed "s:$: 3:" >> train.txt
ls train/4 | sed "s:^:4/:" | sed "s:$: 4:" >> train.txt
ls train/5 | sed "s:^:5/:" | sed "s:$: 5:" >> train.txt
ls train/6 | sed "s:^:6/:" | sed "s:$: 6:" >> train.txt

(screenshot: a sample of the generated train.txt)
3) Prepare val.txt, listing each validation image's path and class index in the same way:

ls val/0 | sed "s:^:0/:" | sed "s:$: 0:" >> val.txt  
ls val/1 | sed "s:^:1/:" | sed "s:$: 1:" >> val.txt
ls val/2 | sed "s:^:2/:" | sed "s:$: 2:" >> val.txt
ls val/3 | sed "s:^:3/:" | sed "s:$: 3:" >> val.txt
ls val/4 | sed "s:^:4/:" | sed "s:$: 4:" >> val.txt
ls val/5 | sed "s:^:5/:" | sed "s:$: 5:" >> val.txt
ls val/6 | sed "s:^:6/:" | sed "s:$: 6:" >> val.txt

4) Generate the LMDB files
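The original screenshot showed the exact commands; a plausible reconstruction using Caffe's stock convert_imageset tool (--gray is needed because the images are single-channel; I'm assuming train.txt, val.txt, and the image folders live under /home/dlnu/faceR/data/, and the LMDB output paths match those referenced in train_val.prototxt below):

convert_imageset --gray --shuffle --backend=lmdb \
    /home/dlnu/faceR/data/train/ /home/dlnu/faceR/data/train.txt \
    /home/dlnu/faceR/lmdb/fer2013_train_lmdb
convert_imageset --gray --shuffle --backend=lmdb \
    /home/dlnu/faceR/data/val/ /home/dlnu/faceR/data/val.txt \
    /home/dlnu/faceR/lmdb/fer2013_val_lmdb
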
5) Compute the mean file
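A likely reconstruction of what the screenshot showed, using Caffe's compute_image_mean tool; the output path matches the mean_file referenced in train_val.prototxt:

compute_image_mean /home/dlnu/faceR/lmdb/fer2013_train_lmdb \
    /home/dlnu/faceR/meantrain.binaryproto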

Building the Network

type: the type of each layer. The network starts with a Data layer, followed by three convolution layers each paired with a pooling layer, and ends with three fully connected layers.

kernel_size: the spatial size of the convolution kernel

num_output: the number of convolution kernels, i.e. output channels (the original called this kernel_num, but num_output is the actual Caffe field)

stride: the step the kernel moves at each slide

pad: zero-padding added around the border so that convolution does not shrink the feature map (see the formula after this list)

output: the dimensionality of the layer's output

dropout: randomly zeroes activations during training to reduce overfitting
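
As a quick check, Caffe computes a convolution's spatial output size as output = floor((input + 2×pad − kernel_size) / stride) + 1. For conv1 below, operating on 42×42 crops: (42 + 2×2 − 5)/1 + 1 = 42, so the feature map keeps its size, exactly what the pad entry above promises.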

Write train_val.prototxt

name: "FacialNet"
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mirror: true
crop_size: 42
mean_file: "/home/dlnu/faceR/meantrain.binaryproto"
}
data_param {
source: "/home/dlnu/faceR/lmdb/fer2013_train_lmdb"
batch_size: 64
backend: LMDB
}
}
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
mirror: false
crop_size: 42
mean_file: "/home/dlnu/faceR/meantrain.binaryproto"
}
data_param {
source: "/home/dlnu/faceR/lmdb/fer2013_val_lmdb"
batch_size: 32
backend: LMDB
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
kernel_size: 5
pad: 2
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
pad: 1
kernel_size: 4
stride: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool2"
type: "Pooling"
bottom: "norm2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "pool2"
top: "conv3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "norm3"
type: "LRN"
bottom: "conv3"
top: "norm3"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool3"
type: "Pooling"
bottom: "norm3"
top: "pool3"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "fc4"
type: "InnerProduct"
bottom: "pool3"
top: "fc4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 2048
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "fc4"
top: "fc4"
}
layer {
name: "drop4"
type: "Dropout"
bottom: "fc4"
top: "fc4"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc5"
type: "InnerProduct"
bottom: "fc4"
top: "fc5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "fc5"
top: "fc5"
}
layer {
name: "drop5"
type: "Dropout"
bottom: "fc5"
top: "fc5"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "fc5"
top: "fc6"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 7
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "fc6"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc6"
bottom: "label"
top: "loss"
}
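
For reference, tracing the blob shapes through this net (using Caffe's conventions: convolution floors the output size, pooling ceils it — my arithmetic, worth double-checking against the training log): data 1×42×42 → conv1 32×42×42 → pool1 32×21×21 → conv2 32×10×10 → pool2 32×5×5 → conv3 64×5×5 → pool3 64×2×2, so fc4 sees a 256-dimensional input.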

Write solver.prototxt

net: "/home/dlnu/faceR/train_val.prototxt"
test_iter: 110
test_interval: 1000       # run a test pass every 1000 training iterations
base_lr: 0.001
momentum: 0.9
weight_decay: 0.0005
lr_policy: "fixed"        # constant learning rate; gamma and stepsize below are ignored under "fixed" (they belong to lr_policy: "step")
gamma: 0.1
stepsize: 50000
display: 100
max_iter: 200000
snapshot: 10000
snapshot_prefix: "/home/dlnu/faceR/result"
solver_mode: GPU
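
A quick consistency check: test_iter × test-phase batch_size = 110 × 32 = 3,520 images per test pass, slightly under the 3,589 validation images, so nearly the whole validation set is evaluated each time.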

Training

caffe train --solver=/home/dlnu/faceR/solver.prototxt
echo Training End
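
If training is interrupted, it can be resumed from the solver state that Caffe writes alongside each snapshot (--snapshot is a standard caffe flag; the exact filename below is inferred from snapshot_prefix and is an assumption):

caffe train --solver=/home/dlnu/faceR/solver.prototxt \
    --snapshot=/home/dlnu/faceR/result_iter_10000.solverstate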

(screenshot: training log)

Write deploy.prototxt

name: "FacialNet"
layer {
name: "data"
type: "Input"
top: "data"
input_param { shape: { dim: 10 dim: 1 dim: 42 dim: 42 } }
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
kernel_size: 5
pad: 2
stride: 1
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
pad: 1
kernel_size: 4
stride: 2
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool2"
type: "Pooling"
bottom: "norm2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "pool2"
top: "conv3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
stride: 1
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "norm3"
type: "LRN"
bottom: "conv3"
top: "norm3"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool3"
type: "Pooling"
bottom: "norm3"
top: "pool3"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "fc4"
type: "InnerProduct"
bottom: "pool3"
top: "fc4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 2048
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "fc4"
top: "fc4"
}
layer {
name: "drop4"
type: "Dropout"
bottom: "fc4"
top: "fc4"
dropout_param {
dropout_ratio: 0.4
}
}
layer {
name: "fc5"
type: "InnerProduct"
bottom: "fc4"
top: "fc5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 1024
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "fc5"
top: "fc5"
}
layer {
name: "drop5"
type: "Dropout"
bottom: "fc5"
top: "fc5"
dropout_param {
dropout_ratio: 0.4
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "fc5"
top: "fc6"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 7
}
}
layer {
name: "prob"
type: "Softmax"
bottom: "fc6"
top: "prob"
}
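
Compared with train_val.prototxt, the deploy file swaps the two Data layers for a single Input layer taking a batch of ten 1×42×42 images, drops the weight/bias fillers and the Accuracy and SoftmaxWithLoss layers, and ends with a plain Softmax that outputs class probabilities. The dropout_ratio here (0.4) differs from training (0.5), but Caffe's Dropout layer is a pass-through at test time, so the value has no effect during inference.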

Write a Python script to test the model

#coding=utf-8
import sys
sys.path.append("/home/dlnu/caffe/python")

import caffe
import numpy as np

def faceRecognition(imagepath):
    root = '/home/dlnu/faceR/'                            # project root
    deploy = root + 'model/deploy.prototxt'               # network definition for inference
    caffe_model = root + 'result_iter_10000.caffemodel'   # trained weights
    img = root + 'predict/' + imagepath
    labels_filename = root + 'data/labels.txt'            # maps class indices back to names
    net = caffe.Net(deploy, caffe_model, caffe.TEST)      # load the model and network

    # Input preprocessing
    transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})  # target shape (N,1,42,42)
    transformer.set_transpose('data', (2,0,1))  # reorder dimensions: (42,42,1) -> (1,42,42)
    #transformer.set_mean('data', np.load(mean_file).mean(1).mean(1))  # mean subtraction; note that train_val.prototxt does subtract a mean, so enabling this with the training mean would be more consistent
    transformer.set_raw_scale('data', 255)      # rescale pixel values from [0,1] to [0,255]
    net.blobs['data'].reshape(1,1,42,42)        # predict a single image at a time
    im = caffe.io.load_image(img, False)        # load the image as grayscale
    net.blobs['data'].data[...] = transformer.preprocess('data', im)  # apply the preprocessing and fill the input blob

    # Run the test
    out = net.forward()
    labels = np.loadtxt(labels_filename, str, delimiter='\t')  # read the class-name file
    prob = net.blobs['prob'].data[0].flatten()  # per-class probabilities from the final Softmax layer
    order = prob.argsort()[-1]                  # index of the largest probability
    face = labels[order]
    return face                                 # the recognized expression

face = faceRecognition('00146.jpg')
print("Your expression is: " + face)

(screenshot: prediction output)

Reference:
https://blog.csdn.net/pangyunsheng/article/details/79481447