This isn't the right approach. For one thing, modular samplers tend to be rather limited (by space, storage, and current draw) compared to a 15-20 year old Akai. Those older units work differently: you get multisample capability because the RAM isn't as limited, plus some (the late E-Mu ones) offer interesting resampling capabilities. Contrast that with something like the 1010 Bitbox, where you basically have sixteen monophonic samplers under a global control set and the sample triggers.
What I would suggest is one of the late E-Mu units (E5000, ESI4K) or a later Akai (5000 or, if you can find one with the panel, 6000), stacked with as much RAM as it'll handle. Then just leave the sampler's OS on its internal drive and, for all of the sample data, get a Gotek or Nalbantov FD replacer so that you can use large SD cards or thumb drives for your sample files as well as any patch save data. Once you've got, say, 256 GB or more for those, with rapid media access speeding up the sample load/save process, you've got a seriously hot-rodded hardware sampler.
Now, here's how you factor the modular back into this...
Instead of creating a build that does ALL of the sampling work, create one that ONLY works as a controller system for the sampler. This would have a sequencer, the usual array of modulation sources and cohort modules...but NO sound generation capabilities. Instead, this build needs to use Silent Way (if on PC), Volta (Mac only) or Ableton Live's CV Tools (either) to translate and/or process the outgoing CV/gate/trig signals from the modular, convert them to the appropriate sysex calls, and send them back out to the sampler, WITH the potential for adding further complexities via the computer's capabilities. Beefier, more robust, and more open-ended.
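
To make the CV-to-MIDI translation step a bit more concrete, here's a rough Python sketch of the kind of conversion that software layer is doing. It is NOT how Silent Way, Volta or CV Tools actually work internally. It also sticks to plain note-on/note-off messages rather than sysex, since sysex payloads are specific to each sampler and I'm not going to guess at those. Everything named here (`cv_to_note`, `GateToMidi`, the 0 V = note 60 reference, the 2.5 V gate threshold) is an assumption for illustration only.

```python
# Hypothetical sketch: turn sampled CV/gate readings into raw MIDI bytes.
# Assumptions (not from the post): 1 V/oct pitch CV, 0 V = MIDI note 60,
# the gate counts as "high" at or above 2.5 V, and MIDI channel 1 is used.

BASE_NOTE = 60         # assumed MIDI note at 0 V
GATE_THRESHOLD = 2.5   # assumed gate threshold in volts


def cv_to_note(volts: float) -> int:
    """Map a 1 V/oct pitch CV to the nearest MIDI note, clamped to 0..127."""
    note = BASE_NOTE + round(volts * 12)
    return max(0, min(127, note))


def note_on(note: int, velocity: int = 100, channel: int = 0) -> bytes:
    """Build a 3-byte MIDI note-on message (status 0x90 + channel)."""
    return bytes([0x90 | (channel & 0x0F), note & 0x7F, velocity & 0x7F])


def note_off(note: int, channel: int = 0) -> bytes:
    """Build a 3-byte MIDI note-off message (status 0x80 + channel)."""
    return bytes([0x80 | (channel & 0x0F), note & 0x7F, 0])


class GateToMidi:
    """Watch for gate edges and emit note-on / note-off messages."""

    def __init__(self) -> None:
        self.held_note = None  # note currently sounding, if any

    def process(self, pitch_volts: float, gate_volts: float) -> list:
        msgs = []
        gate_high = gate_volts >= GATE_THRESHOLD
        if gate_high and self.held_note is None:
            # Rising edge: latch the pitch and start the note.
            self.held_note = cv_to_note(pitch_volts)
            msgs.append(note_on(self.held_note))
        elif not gate_high and self.held_note is not None:
            # Falling edge: release whatever note was latched.
            msgs.append(note_off(self.held_note))
            self.held_note = None
        return msgs
```

In a real rig the voltages would come in through a DC-coupled interface and the resulting bytes would go out a MIDI port to the sampler; the computer in the middle is where you'd add the extra processing (scaling, quantizing, splitting across channels, and so on).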