Model parallelism in neural networks partitions a model's parameters, and the computation associated with them, across multiple compute devices, so that models too large for a single device's memory or compute capacity can still be trained and served.
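As a minimal sketch of the idea, the weight matrix of a single layer can be split column-wise into shards, with each shard held and used by a different device; here NumPy arrays stand in for per-device shards, and all names are illustrative.

```python
import numpy as np

# Tensor-parallel sketch: one layer's weight matrix is split column-wise
# into shards, one per (simulated) device. Each "device" computes its
# partial output independently; concatenating the partial outputs along
# the feature axis reproduces the full layer's result.

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))   # activations: batch of 4, 8 features
W = rng.standard_normal((8, 6))   # full layer weight: 8 -> 6

num_devices = 2
shards = np.split(W, num_devices, axis=1)  # each device holds an 8x3 shard

# Every device sees the same input and multiplies by its own shard.
partial_outputs = [x @ shard for shard in shards]

# Gathering the partials recovers the unpartitioned layer's output.
y_parallel = np.concatenate(partial_outputs, axis=1)
y_full = x @ W
assert np.allclose(y_parallel, y_full)
```

In a real system the shards would live in separate device memories and the gather would be a cross-device collective, but the arithmetic partitioning is the same.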