We heard that some users wanted higher input limits, so we've raised the input cap to 2,000 characters and the output cap to 2,000 tokens.
If the extra inference load slows the app down too much, we'll add more GPUs. Hopefully this helps the folks using the app for code.