Road network and building footprint extraction is essential for many
applications such as updating maps, traffic regulations, city planning,
ride-hailing, disaster response \textit{etc}. Mapping road networks is
currently both expensive and labor-intensive. Recently, improvements in image
segmentation through the application of deep neural networks has shown
promising results in extracting road segments from large scale, high resolution
satellite imagery. However, significant challenges remain due to lack of enough
labeled training data needed to build models for industry grade applications.
In this paper, we propose a two-stage transfer learning technique to improve
robustness of semantic segmentation for satellite images that leverages noisy
pseudo ground truth masks obtained automatically (without human labor) from
crowd-sourced OpenStreetMap (OSM) data. We further propose Pyramid
Pooling-LinkNet (PP-LinkNet), an improved deep neural network for segmentation
that uses focal loss, poly learning rate, and context module. We demonstrate
the strengths of our approach through evaluations done on three popular
datasets over two tasks, namely, road extraction and building foot-print
detection. Specifically, we obtain 78.19\% meanIoU on SpaceNet building
footprint dataset, 67.03\% and 77.11\% on the road topology metric on SpaceNet
and DeepGlobe road extraction dataset, respectively.