One of the key insights is that every program is representable as an integer, an...

One of the key insights is that every program is representable as an integer, and you can map essentially any integer into a program. (If this seems strange, think of a file on a computer--traditionally a stream of bytes--as just a large base-256 integer.) This means we can iterate every possible program, as there are only a countably infinite number of programs.

I should also note that this is required by our conventional definition of program (specifically, a Turing machine). There's definitely models of computation more powerful than Turing machines, but these are not believed to be physically realizable (the Church-Turing Thesis); in any case, the diagonalization form of the proof of the halting problem's uncomputability indicates that these more powerful methods are also susceptible to their own variant of the halting problem.