Cheaper Deep Learning by Transfer Learning Cats to Seismic [SEG Conference 2018]

At this year's SEG I got the opportunity to present some of my work on transfer learning. In automatic seismic interpretation, progress is already perceived as incremental, although the field was established only fairly recently. New deep neural networks, usually convolutional neural networks, showed that reproducing human seismic interpretation is possible. This was enough for many to disregard further progress, but the way I see it, we aren’t nearly there yet. The conference paper and this blog post outline how to accelerate scientific progress in automatic seismic interpretation by using pre-trained networks, thus reducing training cost and cycle time.

ImageNet and Transfer Learning

The ImageNet challenge, led by Stanford researcher Fei-Fei Li, is built on a collection of natural images that each have an attached label. These labels are fine-grained. The dataset does not just label dogs; it distinguishes Corgis from Labradors. The networks that win these competitions are often published, including their trained weights. It follows that we can use these networks, and the massive training time big corporations and research centers invested in them, to our benefit. An easy example is creating a dog vs cat classifier network. Clearly a network that knows about Persian cats and Chihuahuas can be abstracted to just predict cats versus dogs. This process is called “transfer learning”. We transfer a pre-trained network to another task.
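To make the cat versus dog intuition concrete, here is a minimal sketch in plain Python. It assumes a network that outputs one probability per fine-grained breed and simply sums those probabilities per coarse group. The breed names, the mapping and the probabilities are made up for illustration; they are not real ImageNet class indices or network outputs. Actual transfer learning goes one step further and retrains part of the network on the new task.

```python
# Hypothetical mapping from fine-grained labels to coarse labels.
FINE_TO_COARSE = {
    "persian_cat": "cat",
    "siamese_cat": "cat",
    "chihuahua": "dog",
    "corgi": "dog",
    "labrador": "dog",
}


def coarse_probabilities(fine_probs):
    """Sum fine-grained class probabilities into coarse classes."""
    coarse = {}
    for fine_label, prob in fine_probs.items():
        coarse_label = FINE_TO_COARSE[fine_label]
        coarse[coarse_label] = coarse.get(coarse_label, 0.0) + prob
    return coarse


def predict_coarse(fine_probs):
    """Return the coarse label with the highest summed probability."""
    coarse = coarse_probabilities(fine_probs)
    return max(coarse, key=coarse.get)


# Made-up network output for a single image.
example = {
    "persian_cat": 0.40,
    "siamese_cat": 0.15,
    "chihuahua": 0.25,
    "corgi": 0.10,
    "labrador": 0.10,
}
print(predict_coarse(example))
```

The network never had to learn what a "cat" is; the knowledge about individual breeds is already enough to answer the coarser question.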

I often heard that “of course” these networks would not transfer to seismic data. Anyone who knows me well knows that I can be insufferable, so I set out to test this “common knowledge”. Can we use transfer learning from natural images to seismic data?

#ImageNet workshop today — a historical talk summarizing 8 years of the dataset and competition. https://t.co/v6MBRCNJ1W pic.twitter.com/AexAUfJvpa
— Fei-Fei Li (@drfeifei) July 26, 2017

Automatic Seismic Interpretation

The paper “Deep learning seismic facies on state-of-the-art CNN architectures” can be obtained on my personal website https://dramsch.net, as the SEG does not allow preprints. I freeze the convolutional part of the pre-trained networks, replace the densely connected part, and re-train only that part on seismic images from the MalenoV interpretation. It turns out the VGG network immediately performs at over 90% accuracy. The learnt filters apply to both natural images and seismic images. For ResNets this is not the case; their filters are too specific to natural images. This is consistent with findings in other transfer learning papers.
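A minimal sketch of this setup in Keras could look as follows. It is not the code from the paper: the function name, the patch size of 64, the 256-unit dense layer and the nine output classes are placeholder assumptions, and the usual VGG input preprocessing is left out for brevity.

```python
import tensorflow as tf


def build_frozen_vgg_classifier(n_classes=9, patch_size=64, weights="imagenet"):
    """VGG16 convolutional base (frozen) with a new trainable dense head."""
    # include_top=False drops the original densely connected part.
    base = tf.keras.applications.VGG16(
        include_top=False,
        weights=weights,
        input_shape=(patch_size, patch_size, 3),
    )
    base.trainable = False  # keep the pre-trained filters fixed

    # Seismic patches have one channel; VGG16 expects three, so repeat it.
    inputs = tf.keras.Input(shape=(patch_size, patch_size, 1))
    x = tf.keras.layers.Concatenate(axis=-1)([inputs, inputs, inputs])
    x = base(x, training=False)

    # New densely connected part, trained from scratch on seismic labels.
    x = tf.keras.layers.Flatten()(x)
    x = tf.keras.layers.Dense(256, activation="relu")(x)
    outputs = tf.keras.layers.Dense(n_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Calling `build_frozen_vgg_classifier()` downloads the ImageNet weights on first use. Only the two dense layers are updated during `model.fit`, which is what keeps training cheap.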

However, in the recent Kaggle competition on salt posed by TGS, a ResNet was used in the winning solution. That solution uses fine-tuning to slightly alter the pre-trained network for transfer learning. This assumes that the pre-trained filters of the ResNet are “different” enough from each other that slight changes are sufficient to adapt them to the new data.
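Fine-tuning can be sketched in Keras as shown here. This is an illustration of the general technique on a simple classifier, not the Kaggle winners' code. The function name, the number of unfrozen layers and the learning rate are assumptions chosen for the example.

```python
import tensorflow as tf


def build_finetune_resnet(n_classes=2, patch_size=64, weights="imagenet",
                          n_unfrozen=10):
    """ResNet50 whose last few layers are fine-tuned at a low learning rate."""
    base = tf.keras.applications.ResNet50(
        include_top=False,
        weights=weights,
        input_shape=(patch_size, patch_size, 3),
        pooling="avg",
    )
    # Freeze everything except the last n_unfrozen layers.
    base.trainable = True
    for layer in base.layers[:-n_unfrozen]:
        layer.trainable = False
    # Keep batch normalisation frozen so its statistics are not disturbed.
    for layer in base.layers:
        if isinstance(layer, tf.keras.layers.BatchNormalization):
            layer.trainable = False

    inputs = tf.keras.Input(shape=(patch_size, patch_size, 3))
    x = base(inputs)
    outputs = tf.keras.layers.Dense(n_classes, activation="softmax")(x)
    model = tf.keras.Model(inputs, outputs)

    # A small learning rate so the pre-trained filters change only slightly.
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model, base
```

The difference from a fully frozen network is that some of the pre-trained filters are allowed to move, but only by small steps.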

How to Apply This Knowledge

Just taking a ResNet or VGG net and fine-tuning it on seismic data is a bad idea, because it is very slow. Therefore people often use U-Nets https://bit.ly/kaggle-salt and fully convolutional networks. Luckily, we can use the pre-trained networks within these fully convolutional networks. Just as I cut off the densely connected part of the network and replaced it with another dense network, one can cut off the dense network and replace it with an embedding layer and a decoder network, as is usual in these network structures for “segmentation”.
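The idea of swapping the dense part for a decoder can be sketched like this. It is a deliberately plain encoder-decoder without the skip connections a real U-Net has, and the function name, filter counts and two output classes are assumptions for illustration. The patch size has to be a multiple of 32 because VGG16 halves the resolution five times.

```python
import tensorflow as tf


def build_vgg_encoder_decoder(n_classes=2, patch_size=64, weights="imagenet"):
    """Frozen VGG16 encoder followed by a small trainable decoder."""
    encoder = tf.keras.applications.VGG16(
        include_top=False,
        weights=weights,
        input_shape=(patch_size, patch_size, 3),
    )
    encoder.trainable = False

    inputs = tf.keras.Input(shape=(patch_size, patch_size, 3))
    x = encoder(inputs, training=False)  # spatial size is patch_size / 32

    # Upsample five times to undo the five pooling steps of the encoder.
    for filters in (256, 128, 64, 32, 16):
        x = tf.keras.layers.Conv2DTranspose(
            filters, kernel_size=3, strides=2, padding="same", activation="relu"
        )(x)

    # One class probability vector per pixel.
    outputs = tf.keras.layers.Conv2D(
        n_classes, kernel_size=1, activation="softmax"
    )(x)
    return tf.keras.Model(inputs, outputs)
```

The output has the same height and width as the input, so every pixel of a seismic patch receives its own label while only the decoder has to be trained.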

This is exactly what was done in the Kaggle competition: several pre-trained encoders were tested and ended up placing very high in various encoder-decoder configurations. This offers the following benefits to other scientists:

We can train on smaller datasets. Large interpreted seismic cubes may not be available in every research group, but smaller interpretations can still be feasible with an encoder that already performs very well.

The VGG filters have been used in many applications in machine learning. Confirming that they apply to seismic data too is beneficial, because we can use VGG-based similarity measures and other innovations from the machine learning space to further our insights into seismic data, even beyond automatic seismic interpretation.
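One possible form of such a VGG-based similarity measure is sketched here: two patches are compared by the mean squared difference of their VGG16 feature maps instead of their raw pixels. The choice of the `block3_conv3` layer, the function names and the patch size are assumptions for this example, and input preprocessing is again omitted.

```python
import numpy as np
import tensorflow as tf


def build_feature_extractor(patch_size=64, weights="imagenet",
                            layer_name="block3_conv3"):
    """Model that maps an image patch to an intermediate VGG16 feature map."""
    vgg = tf.keras.applications.VGG16(
        include_top=False,
        weights=weights,
        input_shape=(patch_size, patch_size, 3),
    )
    return tf.keras.Model(vgg.input, vgg.get_layer(layer_name).output)


def feature_distance(extractor, patch_a, patch_b):
    """Mean squared difference between the feature maps of two patches."""
    batch = np.stack([patch_a, patch_b]).astype("float32")
    features = extractor.predict(batch, verbose=0)
    return float(np.mean((features[0] - features[1]) ** 2))
```

A small distance means the two patches look alike to the pre-trained filters, which can be more meaningful than a pixel-by-pixel comparison.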

Give Me the Deets!

You may want to confirm my bold statements. Fear not, the code to recreate the paper is up on GitHub. The paper is available on the SEG website and my personal website. The presentation is available on Figshare; however, my presentation style uses PowerPoint slides to support the spoken word, and the spoken part is obviously not available there.

Jesper Dramsch

PhD Student at The Way of the Geophysicist

... is a geophysicist at heart. He works at the intersection of machine learning and geoscience. He is the founder of The Way of the Geophysicist and a deep learning enthusiast. He writes mostly about computational geoscience and interesting bits and pieces relevant to post-grad life.