output modules are mostly superfluous - an attenuator, a vca, a mixer output, or nothing at all will work perfectly well as the final output
if you NEED balanced outputs then buy an output module that has balanced outputs
if you NEED headphones out of the modular - then buy a headphone module (ALM HPO for example)
to be honest I wouldn't worry too much about mono/stereo at the moment - most modules are mono (including the 2 you have)
however, if you still want a stereo output from your modular, then you will need a module that can take mono signals and either stereo-ize them (some fx modules will do this) or pan the mono signal into the stereo field (a panning module - or you could patch it using 2 vcas and probably maths)
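to make the 2 vca panning idea a bit more concrete, here is a rough sketch in python of the arithmetic involved - one gain for the left vca, one for the right, both derived from a single pan position. the function names and the equal-power (sine/cosine) curve are my own assumptions for illustration - two linear vcas fed a cv and its inverted copy would give you a straight linear crossfade instead, which dips a little in level at the centre

```python
import math


def pan_gains(pan):
    """Return (left_gain, right_gain) for pan in -1.0 (hard left) .. +1.0 (hard right).

    Assumption: an equal-power pan law, so left^2 + right^2 is always 1.
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must be between -1.0 and 1.0")
    # Map the pan position onto a quarter circle: 0 (hard left) .. pi/2 (hard right).
    angle = (pan + 1.0) * math.pi / 4.0
    return math.cos(angle), math.sin(angle)


def pan_sample(sample, pan):
    """Split one mono sample into a (left, right) pair, like two VCAs fed the same signal."""
    left_gain, right_gain = pan_gains(pan)
    return sample * left_gain, sample * right_gain
```

in a rack the pan value would be a control voltage (from maths, an lfo, a sequencer and so on) rather than a number, but the relationship between the two vca levels is the same idea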
a lot of 'stereo' mixers offer mono inputs that they copy to both L & R equally - so still mono, but from 2 speakers
there are also quite a few that have panning - either manual or voltage controlled
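as a rough illustration of the difference between those two kinds of 'stereo' mixer, here is a small python sketch of a stereo bus that sums mono channels. everything in it (the function name, the -1 to +1 pan range, the simple linear pan law) is an assumption made for the example and not how any particular mixer is built

```python
def mix_to_stereo(channels):
    """Sum mono channels onto a stereo bus.

    channels: list of (sample, pan) pairs, with pan from -1.0 (hard left)
    to +1.0 (hard right). A pan of 0.0 sends the same amount to both sides.
    Assumption: a simple linear pan law, chosen to keep the arithmetic obvious.
    """
    left = 0.0
    right = 0.0
    for sample, pan in channels:
        left += sample * (1.0 - pan) / 2.0
        right += sample * (1.0 + pan) / 2.0
    return left, right


# No panning: every channel sits in the centre, so left and right are identical
# (still mono, just coming out of 2 speakers).
centred = mix_to_stereo([(0.5, 0.0), (0.25, 0.0)])

# With panning: the two channels end up on opposite sides.
panned = mix_to_stereo([(1.0, -1.0), (0.5, 1.0)])
```

with every pan left at 0.0 the left and right sums are always equal - it only becomes stereo in any useful sense once the pan values differ between channels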
this is the way I think I would go if I were you, but only after more important modules - such as vcas, modulation source, some basic utilities (kinks, links, shades - or equivalents) and at the very least a filter - probably also a reverb or delay
in fact I don't think I would bother with this until you are ready to add more voices
I would just add the maths and the vca and a way to play it - either via a keyboard, a sequencer or a midi interface - if the one you want is unavailable - just find something cheap/used that will do the job for now - and get the one you want when it is available and either sell the stopgap or keep it for backup / additional sequencing duties
and work my way through the maths illustrated manual a good few times - this will help massively in your learning - it will show you what functions you want - you then have to work out if you need to add a module to cover one or more of those functions or if you want to add another modulation source - a disting would also be a good way of doing this - but, unless you get the ex, it has the same restriction as maths - it can only really do one thing at a time
for midi converters, as with mixers, go for one with more channels than you think you'll need - the chances are that you will need them at some point in the future
for a mixer (with built in vcas, manual panning and headphone outs) a really cost effective solution is the tesseract modular tex-mix, which is expandable - you just buy a master module and as many 4 channel mono or stereo modules as you want - you can even get direct outs - which are post fader! this is what I use - most of the functionality of the WMD performance mixer, but for a lot less money - knobs instead of faders and no vc-panning - but big deal - I have 8 mono channels and 4 stereo channels for somewhere around 300€
"some of the best base-level info to remember can be found in Jim's sigfile" @Lugia
Utility modules are the dull polish that makes the shiny modules actually shine!!!
sound sources < sound modifiers < modulation sources < utilities