
How to Run OpenAI-Like Models Locally


As AI becomes more integrated into everyday tools and services, many developers and enthusiasts are interested in using AI models like OpenAI's GPT for tasks such as text generation, chatbots, and more. However, accessing OpenAI's models requires an internet connection to their cloud servers, which isn't always convenient or possible for certain applications. What if you need to run OpenAI-like models offline?

In this post, we will explore how to use OpenAI-compatible models offline, including options like LLaMA and Mistral, which allow you to run powerful language models on your local machine or server without the need for constant internet access.


Why Use AI Models Offline?

There are several reasons why running AI models offline is advantageous:

  • Data Privacy: When working with sensitive data, you might prefer not to send any information to cloud-based servers.
  • Cost Efficiency: Offline models eliminate the need for paying API fees or maintaining subscriptions to cloud services.
  • Flexibility: Running models locally provides complete control over customizations and configurations.
  • No Internet Required: You can run your applications in environments where internet connectivity is limited or non-existent.

 

Now, let’s dive into how you can set up and use OpenAI alternatives offline.

To use AI offline, follow these steps:

Step 1. Install Ollama, which can be found at the following link:
https://ollama.com/download
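Once the installer finishes, you can confirm the Ollama command-line tool is available by opening a command prompt (cmd) and checking the version (this should print something like "ollama version 0.x.x"):

ollama --version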

Step 2. Choose the model you want from the Ollama library:
https://ollama.com/library
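Each model page in the library lists size tags (for example, 8b or 70b parameter variants). If you only want to download a model's files ahead of time, without starting a chat session, Ollama's pull command fetches them the same way; llama3.1 here is just one example from the library:

ollama pull llama3.1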


Step 3. Open a command prompt (cmd), run "ollama run <model_name>", and wait for the installation to finish. This command automatically downloads the required model files. For example:

ollama run llama3.1

or

ollama run llama3.2

or

ollama run deepseek-r1
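When the download completes, Ollama drops you into an interactive chat session in the terminal. You can also list the models installed on your machine, or send a single prompt without entering the chat (assuming you pulled llama3.1 as above):

ollama list

ollama run llama3.1 "Explain in one sentence what a local LLM is."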


💡 Where are models stored?
Windows: C:\Users\UserName\.ollama\models
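Model files can be large (several gigabytes each). If you prefer to store them on another drive, Ollama honors the OLLAMA_MODELS environment variable; the path below is only an example, and Ollama must be restarted afterwards:

setx OLLAMA_MODELS "D:\OllamaModels"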

Step 4. Install AI Requester, which can be found at the following link: https://vovsoft.com/software/ai-requester/
This kind of software makes it easy to switch between offline and online modes, giving you flexibility depending on your needs.

Step 5. Open the View menu, select "OpenAI Settings", and set "API URL" to:

http://127.0.0.1:11434/v1/chat/completions
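If you want to verify the endpoint before configuring any client, a quick curl request (bundled with Windows 10 and later) should return a JSON completion; the model name assumes you downloaded llama3.1 in Step 3:

curl http://127.0.0.1:11434/v1/chat/completions -H "Content-Type: application/json" -d "{\"model\": \"llama3.1\", \"messages\": [{\"role\": \"user\", \"content\": \"Hello!\"}]}"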

That's all! You can now chat with any OpenAI-compatible model locally.


About the Author
Fatih Ramazan Çıkan
Software development enthusiast | Electronics engineer


Comments (3)


AJ
Feb 6, 2025 at 09:31 pm (PST)
Hello, thanks for the info. I would like to know: is it possible to use the Image and Voice sections of the Requester with a local API key (generated somehow), or is this something the user will still have to get as a subscription over the internet for these uses?

I had signed up with OpenAI (GPT-4o) to purchase tokens for voice use in the Vovsoft Text to Speech program, but isn't it true that each time I use their API key it uses up credits, so I will eventually have to buy more tokens? Thanks for any helpful info on this. AJ.
Vovsoft Support
Feb 7, 2025 at 01:55 am (PST)
Hello AJ,

Currently, the Image and Audio sections are only available for OpenAI. You need to pay for OpenAI API credits, not an OpenAI subscription! We will try to support local models for image and audio in the future. By the way, you don't need an API key for local models.
AJ
Feb 8, 2025 at 10:08 pm (PST)
Ok, got it, Thanks much.
