Inference Deployment of Large Language Models