⟩ Tell me why do segmentation CNNs typically have an encoder-decoder style / structure?
The encoder CNN can basically be thought of as a feature extraction network, while the decoder uses that information to predict the image segments by “decoding” the features and upscaling to the original image size.