In fairness, in-place sorting usually isn't what you want in functional programming. Immutable data structures help enforce the pure functional constraints. Of course, in practice, your library code will probably make a mutable copy, sort that in-place and then return an immutable copy/representation of the result, which is hopefully more efficient. (Disclaimer: I'm no Haskell programmer; my FP experience is limited to Lisp dialects)
EDIT: As a curious example, Clojure's sort function actually converts the sequence to a (Java) Array and calls java.util.Arrays.sort() on it, then turns it back into a sequence:
Advancing compilers is hard. When people argue efficiency as a compiler implementation detail that is going to get worked out, they forget about many who have fallen before them.
You can argue some older languages were written in a way in which it was (reasonably) easy to write a compiler that generates code with little performance overhead when compared to assembly (at worst a factor of 2 to 4, back in the 80s). Some popular languages keep adding abstractions and constructs that the compiler can actually deal with without some new compiler research breakthrough.
Personally of the higher-level languages, I found SBCL to be crazy good at optimizing, of course given a few nudges with a compiler directive or two. By inspecting the dissasembly, I could validate it was doing the "right-thing" (TM), but then again, I do the same to check the C/C++ compiled code.
EDIT: As a curious example, Clojure's sort function actually converts the sequence to a (Java) Array and calls java.util.Arrays.sort() on it, then turns it back into a sequence:
https://github.com/clojure/clojure/blob/b578c69d7480f621841e...