First up, separating your modulation/CV from your audio generators and modifiers is a really, REALLY bad idea. For one thing, if you wanted to use a logic gate as a waveshaper (which can be quite neat...it results in nasty pulse waves!), you'd have to run a long cable all the way to the other cab, connect to the gate, then make another long run back to your audio chain. This is just one example, and I know there are tons of others, but the point is that working this way is VERY inefficient and unintuitive. Remember: one of the strongest points of modular synthesis is that there's a certain degree of interchangeability between things that make noise, things that modify noise, and things that make those two things happen. And having everything in one place, as one unitized whole, is a thing that modular is prized for.
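If it's not obvious why a logic gate turns audio into pulse waves, here's a rough Python sketch of the idea. It assumes an idealized AND gate that reads anything above a threshold as "high" and outputs a fixed gate voltage; the 1 V threshold, the 5 V gate level, and the function names are my own placeholders, not the specs of any particular module.

```python
import math

def sine(freq_hz, n_samples, sample_rate=48000, amplitude=5.0):
    """Audio-rate sine wave, in volts (assumed +/-5 V swing)."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]

def logic_and(a_volts, b_volts, threshold=1.0, high=5.0):
    """Idealized AND gate: outputs `high` only while both inputs exceed `threshold`.
    Threshold and output level are assumptions; real modules vary."""
    return high if (a_volts > threshold and b_volts > threshold) else 0.0

# Patch two oscillators into the gate's inputs and listen to its output.
osc_a = sine(110.0, 4800)
osc_b = sine(165.0, 4800)
shaped = [logic_and(a, b) for a, b in zip(osc_a, osc_b)]

# The output only ever sits at "low" or "high", so it is a pulse wave
# whose width changes as the two inputs drift in and out of phase.
levels = sorted(set(shaped))
print(levels)
```

The smooth sines go in, and only two voltage levels come out, which is where the harsh, pulse-like character comes from.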
Second, if you have a stereo out here, where's the stereo mixer to feed the stereo out with a proper stereo image? I see a lot of HYUUJE (and in a couple of instances, pointlessly so) modules in a tiny 2 x 84 HP cab, crowding out the room you'll need for the other things (VCAs, mixers, modifiers, etc.) that make these big, expensive modules work to their fullest. This promises to turn into a problem pretty quickly if the target here is "3-4 voices"; if I were to use the DTM mixer as the primary qualifier of how many voices you ACTUALLY have, I'd pin that total at "1".
Remember, a "voice" isn't JUST a VCO. It's the signal chain that goes from a VCO, through a VCF for timbral modification, then through a VCA for amplitude modification. You simply don't have that here, ergo you're not even on track to have those 3-4 voices. You might (read: SHOULD) want to rethink what you're doing here, jettison the idea of splitting the modular functions up into separate cabs, and reconsider putting everything where it belongs...in the same case. It'll make what you're up to a lot clearer in the end.
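To put the "voice = VCO into VCF into VCA" idea in concrete terms, here's a minimal Python sketch. Everything in it is a stand-in I've made up for illustration: a naive sawtooth for the VCO, a one-pole low-pass for the VCF, and a simple decaying envelope driving the VCA. Real modules are far more involved, but the chain is the same.

```python
import math

SAMPLE_RATE = 48000  # assumed sample rate for the sketch

def vco_saw(freq_hz, n_samples):
    """Naive sawtooth in the range -1..1 (not band-limited)."""
    return [2.0 * ((freq_hz * n / SAMPLE_RATE) % 1.0) - 1.0
            for n in range(n_samples)]

def vcf_lowpass(signal, cutoff_hz):
    """One-pole low-pass filter, a very rough stand-in for a VCF."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / SAMPLE_RATE)
    out, y = [], 0.0
    for x in signal:
        y += alpha * (x - y)
        out.append(y)
    return out

def decay_envelope(n_samples, decay_s=0.2):
    """Exponential decay from 1.0, standing in for an envelope generator."""
    return [math.exp(-n / (decay_s * SAMPLE_RATE)) for n in range(n_samples)]

def vca(signal, control):
    """Scale the audio by the control signal, sample by sample."""
    return [x * g for x, g in zip(signal, control)]

def voice(freq_hz, cutoff_hz, n_samples):
    """One complete voice: VCO -> VCF -> VCA."""
    audio = vco_saw(freq_hz, n_samples)
    audio = vcf_lowpass(audio, cutoff_hz)
    return vca(audio, decay_envelope(n_samples))

# Three voices means three complete chains, not three oscillators.
three_voices = [voice(f, 1200.0, 4800) for f in (110.0, 165.0, 220.0)]
```

Notice that every call to `voice` needs its own oscillator, filter, and amplifier stage. That's the module count to budget for when you say "3-4 voices".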

