flash-moe, a new open-source inference engine by Daniel Woods, has run a 400B-parameter large language model on an iPhone 17 Pro, a device with just 12GB of RAM. The project leverages ...