Python Keras accuracy does not change

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/37213388/

Keras accuracy does not change

python, audio, machine-learning, theano, keras

Asked by Murat Aykanat

I have a few thousand audio files and I want to classify them using Keras and Theano. So far, I have generated a 28x28 spectrogram (bigger is probably better, but I am just trying to get the algorithm to work at this point) of each audio file and read the images into a matrix. So in the end I get this big image matrix to feed into the network for image classification.

In a tutorial I found this MNIST classification code:

import numpy as np

from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense
from keras.utils import np_utils

batch_size = 128
nb_classes = 10
nb_epochs = 2

(X_train, y_train), (X_test, y_test) = mnist.load_data()

X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)
X_train = X_train.astype("float32")
X_test = X_test.astype("float32")
X_train /= 255
X_test /= 255

print(X_train.shape[0], "train samples")
print(X_test.shape[0], "test samples")

y_train = np_utils.to_categorical(y_train, nb_classes)
y_test =  np_utils.to_categorical(y_test, nb_classes)

model = Sequential()

model.add(Dense(output_dim = 100, input_dim = 784, activation= "relu"))
model.add(Dense(output_dim = 200, activation = "relu"))
model.add(Dense(output_dim = 200, activation = "relu"))
model.add(Dense(output_dim = nb_classes, activation = "softmax"))

model.compile(optimizer = "adam", loss = "categorical_crossentropy")

model.fit(X_train, y_train, batch_size = batch_size, nb_epoch = nb_epochs, show_accuracy = True, verbose = 2, validation_data = (X_test, y_test))
score = model.evaluate(X_test, y_test, show_accuracy = True, verbose = 0)
print("Test score: ", score[0])
print("Test accuracy: ", score[1])

This code runs, and I get the result as expected:

(60000L, 'train samples')
(10000L, 'test samples')
Train on 60000 samples, validate on 10000 samples
Epoch 1/2
2s - loss: 0.2988 - acc: 0.9131 - val_loss: 0.1314 - val_acc: 0.9607
Epoch 2/2
2s - loss: 0.1144 - acc: 0.9651 - val_loss: 0.0995 - val_acc: 0.9673
('Test score: ', 0.099454972004890438)
('Test accuracy: ', 0.96730000000000005)

Up to this point everything runs perfectly, however when I apply the above algorithm to my dataset, accuracy gets stuck.

My code is as follows:

import os

import pandas as pd

from sklearn.cross_validation import train_test_split

from keras.models import Sequential
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.layers.core import Dense, Activation, Dropout, Flatten
from keras.utils import np_utils

import AudioProcessing as ap
import ImageTools as it

batch_size = 128
nb_classes = 2
nb_epoch = 10  


for i in range(20):
    print "\n"
# Generate spectrograms if necessary
if(len(os.listdir("./AudioNormalPathalogicClassification/Image")) > 0):
    print "Audio files are already processed. Skipping..."
else:
    print "Generating spectrograms for the audio files..."
    ap.audio_2_image("./AudioNormalPathalogicClassification/Audio/","./AudioNormalPathalogicClassification/Image/",".wav",".png",(28,28))

# Read the result csv
df = pd.read_csv('./AudioNormalPathalogicClassification/Result/result.csv', header = None)

df.columns = ["RegionName","IsNormal"]

bool_mapping = {True : 1, False : 0}

nb_classes = 2

for col in df:
    if(col == "RegionName"):
        a = 3      # no-op placeholder: skip the non-label column
    else:
        df[col] = df[col].map(bool_mapping)

y = df.iloc[:,1:].values

y = np_utils.to_categorical(y, nb_classes)

# Load images into memory
print "Loading images into memory..."
X = it.load_images("./AudioNormalPathalogicClassification/Image/",".png")

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = 0)

X_train = X_train.reshape(X_train.shape[0], 784)
X_test = X_test.reshape(X_test.shape[0], 784)
X_train = X_train.astype("float32")
X_test = X_test.astype("float32")
X_train /= 255
X_test /= 255

print("X_train shape: " + str(X_train.shape))
print(str(X_train.shape[0]) + " train samples")
print(str(X_test.shape[0]) + " test samples")

model = Sequential()


model.add(Dense(output_dim = 100, input_dim = 784, activation= "relu"))
model.add(Dense(output_dim = 200, activation = "relu"))
model.add(Dense(output_dim = 200, activation = "relu"))
model.add(Dense(output_dim = nb_classes, activation = "softmax"))

model.compile(loss = "categorical_crossentropy", optimizer = "adam")

print model.summary()

model.fit(X_train, y_train, batch_size = batch_size, nb_epoch = nb_epoch, show_accuracy = True, verbose = 1, validation_data = (X_test, y_test))
score = model.evaluate(X_test, y_test, show_accuracy = True, verbose = 1)
print("Test score: ", score[0])
print("Test accuracy: ", score[1])

AudioProcessing.py

import os
import scipy as sp
import scipy.io.wavfile as wav
import matplotlib.pylab as pylab
import Image

def save_spectrogram_scipy(source_filename, destination_filename, size):
    dt = 0.0005
    NFFT = 1024       
    Fs = int(1.0/dt)  
    fs, audio = wav.read(source_filename)
    if(len(audio.shape) >= 2):
        audio = sp.mean(audio, axis = 1)
    fig = pylab.figure()    
    ax = pylab.Axes(fig, [0,0,1,1])    
    ax.set_axis_off()
    fig.add_axes(ax) 
    pylab.specgram(audio, NFFT = NFFT, Fs = Fs, noverlap = 900, cmap="gray")
    pylab.savefig(destination_filename)
    img = Image.open(destination_filename).convert("L")
    img = img.resize(size)
    img.save(destination_filename)
    pylab.clf()
    del img

def audio_2_image(source_directory, destination_directory, audio_extension, image_extension, size):
    nb_files = len(os.listdir(source_directory));
    count = 0
    for file in os.listdir(source_directory):
        if file.endswith(audio_extension):        
            destinationName = file[:-4]
            save_spectrogram_scipy(source_directory + file, destination_directory + destinationName + image_extension, size)
            count += 1
            print ("Generating spectrogram for files " + str(count) + " / " + str(nb_files) + ".")

ImageTools.py

import os
import numpy as np
import matplotlib.image as mpimg
def load_images(source_directory, image_extension):
    image_matrix = []
    nb_files = len(os.listdir(source_directory));
    count = 0
    for file in os.listdir(source_directory):
        if file.endswith(image_extension):
            with open(source_directory + file,"r+b") as f:
                img = mpimg.imread(f)
                img = img.flatten()                
                image_matrix.append(img)
                del img
                count += 1
                #print ("File " + str(count) + " / " + str(nb_files) + " loaded.")
    return np.asarray(image_matrix)

So I run the above code and receive:

Audio files are already processed. Skipping...
Loading images into memory...
X_train shape: (2394L, 784L)
2394 train samples
1027 test samples
--------------------------------------------------------------------------------
Initial input shape: (None, 784)
--------------------------------------------------------------------------------
Layer (name)                  Output Shape                  Param #
--------------------------------------------------------------------------------
Dense (dense)                 (None, 100)                   78500
Dense (dense)                 (None, 200)                   20200
Dense (dense)                 (None, 200)                   40200
Dense (dense)                 (None, 2)                     402
--------------------------------------------------------------------------------
Total params: 139302
--------------------------------------------------------------------------------
None
Train on 2394 samples, validate on 1027 samples
Epoch 1/10
2394/2394 [==============================] - 0s - loss: 0.6898 - acc: 0.5455 - val_loss: 0.6835 - val_acc: 0.5716
Epoch 2/10
2394/2394 [==============================] - 0s - loss: 0.6879 - acc: 0.5522 - val_loss: 0.6901 - val_acc: 0.5716
Epoch 3/10
2394/2394 [==============================] - 0s - loss: 0.6880 - acc: 0.5522 - val_loss: 0.6842 - val_acc: 0.5716
Epoch 4/10
2394/2394 [==============================] - 0s - loss: 0.6883 - acc: 0.5522 - val_loss: 0.6829 - val_acc: 0.5716
Epoch 5/10
2394/2394 [==============================] - 0s - loss: 0.6885 - acc: 0.5522 - val_loss: 0.6836 - val_acc: 0.5716
Epoch 6/10
2394/2394 [==============================] - 0s - loss: 0.6887 - acc: 0.5522 - val_loss: 0.6832 - val_acc: 0.5716
Epoch 7/10
2394/2394 [==============================] - 0s - loss: 0.6882 - acc: 0.5522 - val_loss: 0.6859 - val_acc: 0.5716
Epoch 8/10
2394/2394 [==============================] - 0s - loss: 0.6882 - acc: 0.5522 - val_loss: 0.6849 - val_acc: 0.5716
Epoch 9/10
2394/2394 [==============================] - 0s - loss: 0.6885 - acc: 0.5522 - val_loss: 0.6836 - val_acc: 0.5716
Epoch 10/10
2394/2394 [==============================] - 0s - loss: 0.6877 - acc: 0.5522 - val_loss: 0.6849 - val_acc: 0.5716
1027/1027 [==============================] - 0s
('Test score: ', 0.68490593621422047)
('Test accuracy: ', 0.57156767283349563)

I tried changing the network and adding more epochs, but no matter what I do I always get the same result, and I don't understand why.

Any help would be appreciated. Thank you.

Edit: I found a mistake where pixel values were not read correctly. I fixed ImageTools.py as below:

import os
import numpy as np
from scipy.misc import imread

def load_images(source_directory, image_extension):
    image_matrix = []
    nb_files = len(os.listdir(source_directory));
    count = 0
    for file in os.listdir(source_directory):
        if file.endswith(image_extension):
            with open(source_directory + file,"r+b") as f:
                img = imread(f)                
                img = img.flatten()                        
                image_matrix.append(img)
                del img
                count += 1
                #print ("File " + str(count) + " / " + str(nb_files) + " loaded.")
    return np.asarray(image_matrix)

Now I actually get grayscale pixel values from 0 to 255, so dividing them by 255 makes sense. However, I still get the same result.

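As a quick sanity check (a sketch using the load_images helper above), the loaded pixel range can be confirmed before scaling:

import ImageTools as it

X = it.load_images("./AudioNormalPathalogicClassification/Image/", ".png")
print(X.dtype, X.min(), X.max())  # expect uint8, 0, 255 before dividing by 255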

Accepted answer by TheWalkingCube

The most likely reason is that the optimizer is not suited to your dataset. Here is the list of Keras optimizers from the documentation.

I recommend you first try SGD with default parameter values. If it still doesn't work, divide the learning rate by 10. Do that a few times if necessary. If your learning rate reaches 1e-6 and it still doesn't work, then you have another problem.

In summary, replace this line:

model.compile(loss = "categorical_crossentropy", optimizer = "adam")

with this:

from keras.optimizers import SGD
opt = SGD(lr=0.01)
model.compile(loss = "categorical_crossentropy", optimizer = opt)

and change the learning rate a few times if it doesn't work.

If it was the problem, you should see the loss getting lower after just a few epochs.

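A minimal sketch of that learning-rate sweep, assuming a hypothetical build_model() helper that returns a freshly initialized network (recompiling an already-trained model keeps its weights):

from keras.optimizers import SGD

for lr in [1e-2, 1e-3, 1e-4, 1e-5, 1e-6]:
    model = build_model()  # hypothetical helper: rebuilds the untrained network
    model.compile(loss = "categorical_crossentropy", optimizer = SGD(lr = lr))
    model.fit(X_train, y_train, batch_size = batch_size, nb_epoch = 5,
              verbose = 2, validation_data = (X_test, y_test))
    # If the training loss starts dropping within a few epochs, keep this lr.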

Answered by Murat Aykanat

After some examination, I found that the issue was the data itself. It was very dirty: the same input had 2 different outputs, which created confusion. After cleaning up the data, my accuracy goes up to 69%. Still not good enough, but at least I can now work my way up from here now that the data is clean.

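One way such conflicts can be surfaced (a sketch, assuming the RegionName/IsNormal columns of the question's result.csv):

import pandas as pd

df = pd.read_csv('./AudioNormalPathalogicClassification/Result/result.csv', header = None)
df.columns = ["RegionName", "IsNormal"]
# Regions that appear with more than one label are contradictory samples.
conflicts = df.groupby("RegionName")["IsNormal"].nunique()
print(conflicts[conflicts > 1])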

I used the below code to test:

import os
import sys

import pandas as pd
import numpy as np

from keras.models import Sequential
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.layers.core import Dense, Activation, Dropout, Flatten
from keras.utils import np_utils

sys.path.append("./")
import AudioProcessing as ap
import ImageTools as it


# input image dimensions
img_rows, img_cols = 28, 28
dim = 1
# number of convolutional filters to use
nb_filters = 32
# size of pooling area for max pooling
nb_pool = 2
# convolution kernel size
nb_conv = 3

batch_size = 128
nb_classes = 2
nb_epoch = 200

for i in range(20):
    print "\n"

## Generate spectrograms if necessary
if(len(os.listdir("./AudioNormalPathalogicClassification/Image")) > 0):
    print "Audio files are already processed. Skipping..."
else:
    # Read the result csv
    df = pd.read_csv('./AudioNormalPathalogicClassification/Result/AudioNormalPathalogicClassification_result.csv', header = None, encoding = "utf-8")

    df.columns = ["RegionName","Filepath","IsNormal"]

    bool_mapping = {True : 1, False : 0}

    for col in df:
        if(col == "RegionName" or col == "Filepath"):
            a = 3      
        else:
            df[col] = df[col].map(bool_mapping)

    region_names = df.iloc[:,0].values
    filepaths = df.iloc[:,1].values
    y = df.iloc[:,2].values
    #Generate spectrograms and make a new CSV file
    print "Generating spectrograms for the audio files..."
    result = ap.audio_2_image(filepaths, region_names, y, "./AudioNormalPathalogicClassification/Image/", ".png",(img_rows,img_cols))
    df = pd.DataFrame(data = result)
    df.to_csv("NormalVsPathalogic.csv",header= False, index = False, encoding = "utf-8")

# Load images into memory
print "Loading images into memory..."
df = pd.read_csv('NormalVsPathalogic.csv', header = None, encoding = "utf-8")
y = df.iloc[:,0].values
y = np_utils.to_categorical(y, nb_classes)
y = np.asarray(y)

X = df.iloc[:,1:].values
X = np.asarray(X)
X = X.reshape(X.shape[0], dim, img_rows, img_cols)
X = X.astype("float32")
X /= 255

print X.shape

model = Sequential()

model.add(Convolution2D(64, nb_conv, nb_conv,
                        border_mode='valid',
                        input_shape=(1, img_rows, img_cols)))

model.add(Activation('relu'))

model.add(Convolution2D(32, nb_conv, nb_conv))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool)))

model.add(Dropout(0.25))

model.add(Flatten())

model.add(Dense(128))
model.add(Activation('relu'))

model.add(Dropout(0.5))

model.add(Dense(nb_classes))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adadelta')

print model.summary()

model.fit(X, y, batch_size = batch_size, nb_epoch = nb_epoch, show_accuracy = True, verbose = 1)

Answered by TheTechGuy

Check out this one

from keras import optimizers

sgd = optimizers.SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)

model.compile( loss = "categorical_crossentropy", 
               optimizer = sgd, 
               metrics=['accuracy']
             )

Check out the documentation

I had better results with MNIST

Answered by Paul Bendevis

If the accuracy is not changing, it means the optimizer has found a local minimum for the loss. This may be an undesirable minimum. One common local minimum is to always predict the class with the most data points. You should use weighting on the classes to avoid this minimum.

from sklearn.utils import compute_class_weight

# 'outputLabels' holds the unique class labels and 'outputs' the per-sample
# label vector; 'balanced' weights each class inversely to its frequency.
classWeight = compute_class_weight('balanced', outputLabels, outputs)
classWeight = dict(enumerate(classWeight))
model.fit(X_train, y_train, batch_size = batch_size, nb_epoch = nb_epochs, show_accuracy = True, verbose = 2, validation_data = (X_test, y_test), class_weight=classWeight)
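
For the question's binary task, those inputs might be derived as follows (a sketch; y_train is the one-hot label matrix from earlier, so the integer labels are recovered first):

import numpy as np
from sklearn.utils import compute_class_weight

y_labels = np.argmax(y_train, axis = 1)  # one-hot -> integer labels
classWeight = compute_class_weight('balanced', np.unique(y_labels), y_labels)
classWeight = dict(enumerate(classWeight))  # maps class index -> weight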

Answered by charel-f

Another cause that I do not see mentioned here, but which created a similar problem for me, was the activation function of the last neuron, especially if it is relu and not something non-linear like sigmoid.

In other words, it might help to use a non-linear activation function in the last layer.

Last layer:

model.add(keras.layers.Dense(1, activation='relu'))

Output:

7996/7996 [==============================] - 1s 76us/sample - loss: 6.3474 - accuracy: 0.5860
Epoch 2/30
7996/7996 [==============================] - 0s 58us/sample - loss: 6.3473 - accuracy: 0.5860
Epoch 3/30
7996/7996 [==============================] - 0s 58us/sample - loss: 6.3473 - accuracy: 0.5860
Epoch 4/30
7996/7996 [==============================] - 0s 57us/sample - loss: 6.3473 - accuracy: 0.5860
Epoch 5/30
7996/7996 [==============================] - 0s 58us/sample - loss: 6.3473 - accuracy: 0.5860
Epoch 6/30
7996/7996 [==============================] - 0s 60us/sample - loss: 6.3473 - accuracy: 0.5860
Epoch 7/30
7996/7996 [==============================] - 0s 57us/sample - loss: 6.3473 - accuracy: 0.5860
Epoch 8/30
7996/7996 [==============================] - 0s 57us/sample - loss: 6.3473 - accuracy: 0.5860

Now I used a non-linear activation function:

model.add(keras.layers.Dense(1, activation='sigmoid'))

Output:

7996/7996 [==============================] - 1s 74us/sample - loss: 0.7663 - accuracy: 0.5899
Epoch 2/30
7996/7996 [==============================] - 0s 59us/sample - loss: 0.6243 - accuracy: 0.5860
Epoch 3/30
7996/7996 [==============================] - 0s 56us/sample - loss: 0.5399 - accuracy: 0.7580
Epoch 4/30
7996/7996 [==============================] - 0s 56us/sample - loss: 0.4694 - accuracy: 0.7905
Epoch 5/30
7996/7996 [==============================] - 0s 57us/sample - loss: 0.4363 - accuracy: 0.8040
Epoch 6/30
7996/7996 [==============================] - 0s 60us/sample - loss: 0.4139 - accuracy: 0.8099
Epoch 7/30
7996/7996 [==============================] - 0s 58us/sample - loss: 0.3967 - accuracy: 0.8228
Epoch 8/30
7996/7996 [==============================] - 0s 61us/sample - loss: 0.3826 - accuracy: 0.8260

This is not directly a solution to the original question, but since this answer ranks #1 on Google when searching for this problem, it might benefit someone.

Answered by Sonali Dasgupta

I faced a similar issue. One-hot encoding the target variable using np_utils in Keras solved the issue of the accuracy and validation loss being stuck. Using weights to balance the target classes further improved performance.

Solution:

from keras.utils.np_utils import to_categorical
y_train = to_categorical(y_train)
y_val = to_categorical(y_val) 

Answered by Farido mastr

I had the same problem as you; my solution was a loop instead of epochs:

for i in range(10):
  history = model.fit_generator(generator=training_generator,
                    validation_data=validation_generator,
                    use_multiprocessing=True,
                    workers=6,
                    epochs=1)

You can also save the model at each iteration, so you can pause training after any epoch you want:

for i in range(10):
  history = model.fit_generator(generator=training_generator,
                    validation_data=validation_generator,
                    use_multiprocessing=True,
                    workers=6,
                    epochs=1)
  #save model
  model.save('drive/My Drive/vggnet10epochs.h5')
  model = load_model('drive/My Drive/vggnet10epochs.h5')

Answered by Vijay Makwana

I got a 13% accuracy increase using this 'sigmoid' activation:

model = Sequential()
model.add(Dense(3072, input_shape=(3072,), activation="sigmoid"))
model.add(Dense(512, activation="sigmoid"))
model.add(Dense(1, activation="sigmoid"))

Or you can also test the following, with 'relu' in the first and hidden layers.

model = Sequential()
model.add(Dense(3072, input_shape=(3072,), activation="relu"))
model.add(Dense(512, activation="sigmoid"))
model.add(Dense(1, activation="sigmoid"))

Answered by Doralisa

I had a similar problem. I had a binary classification task whose classes were labeled 1 and 2. After testing different kinds of optimizers and activation functions, I found that the root of the problem was my labeling of the classes. In other words, I changed the labels to 0 and 1 instead of 1 and 2, and the problem was solved!

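A minimal sketch of that relabeling, assuming y is the integer label vector:

import numpy as np

y = np.asarray(y)
y = y - 1  # map labels {1, 2} -> {0, 1}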