Deep Learning: X-ray classification of COVID-19 Pneumonia

COVID-19 provided an exciting challenge in the world of x-ray image classification. Several papers published in 2020 used convolutional neural networks (CNNs) to classify x-ray images as COVID-19 pneumonia, other viral pneumonia, or healthy lungs. After reviewing these papers, I was optimistic that I could reproduce their results and improve on them with a hard-voting ensemble model.

I created a classification pipeline to do the following:

  1. Input Kaggle data

  2. Run the data through three high-performing models used in other studies (ResNet, DenseNet, and VGG)

  3. Obtain soft predictions on test data

  4. Calculate a threshold for each model that optimizes precision and recall for the target class (COVID-19)

  5. Combine these models in a hard voting ensemble

  6. Assess results of the ensemble model on untouched holdout data
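
Steps 4 and 5 can be sketched in plain Python. This is a minimal illustration, not the code from the project: it assumes the target class is treated as a binary label (1 for COVID-19, 0 otherwise), that "optimizes precision and recall" means maximizing F1 (one common choice; the paper may use a different criterion), and that the ensemble predicts COVID-19 when a strict majority of models vote for it. The function names are hypothetical.

```python
def f1_at_threshold(y_true, scores, threshold):
    """F1 for the positive class when predicting positive at score >= threshold."""
    tp = fp = fn = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 1:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(y_true, scores):
    """Pick the observed score that maximizes F1 (assumed criterion)."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: f1_at_threshold(y_true, scores, t))


def hard_vote(scores_per_model, thresholds):
    """Majority vote over per-model binary decisions.

    scores_per_model[m][i] is model m's soft score for sample i;
    thresholds[m] is the cutoff chosen for model m.
    """
    n_models = len(scores_per_model)
    votes_needed = n_models // 2 + 1  # strict majority
    n_samples = len(scores_per_model[0])
    ensemble = []
    for i in range(n_samples):
        votes = sum(
            1 for m in range(n_models)
            if scores_per_model[m][i] >= thresholds[m]
        )
        ensemble.append(1 if votes >= votes_needed else 0)
    return ensemble
```

In a pipeline like the one above, the thresholds would be chosen on the test split and then held fixed when the ensemble is scored on the holdout data, so the holdout estimate is not tuned to itself.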

I built all of these transfer learning models in TensorFlow and experimented with data augmentation (via ImageDataGenerator), batch normalization, dropout, early stopping, up- and down-sampling to balance the classes, and freezing and unfreezing layers across multiple rounds of training.
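
As an illustration of the class-balancing idea, here is a minimal sketch of random up-sampling in plain Python. It is not the project's code: the function name, the string labels, and the choice to duplicate minority-class samples until every class matches the largest one are all assumptions made for the example.

```python
import random
from collections import defaultdict


def upsample_to_balance(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes
    have as many samples as the largest class."""
    rng = random.Random(seed)

    # Group samples by their class label.
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    target = max(len(items) for items in by_class.values())

    balanced = []
    for label, items in by_class.items():
        # Draw extra samples with replacement to reach the target count.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        balanced.extend((item, label) for item in items + extra)

    rng.shuffle(balanced)
    return balanced
```

Resampling of this kind would normally be applied to the training split only, since duplicating images in the test or holdout data would distort the evaluation.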

In the end, the results were poor. The models we selected were pretrained on ImageNet (everyday photographs of objects), where a model benefits from detecting the edges of objects. X-ray images, by contrast, are grayscale, and the differences between classes show up as cloudiness within the lung. Because this is a very different problem, we probably needed to retrain the entire model on the new data rather than only a select number of top layers, which we did not have the resources to do. It was also evident that at least one of our models was overfitting.

For the full analysis, take a look at the paper below. For the code, feel free to reach out to me directly.

Read paper
