Diving into Self-Hosted LLM on Windows: My Initial Impressions
I've been experimenting with something new: running self-hosted AI on my
Windows PC. I’m still somewhat early in the process—just getting my feet wet—so this
post won’t be a full tutorial or review just yet. I’ll save that for when I’ve had more
time to dig in and really understand it. But the results have been interesting so far, and I
plan to do more with it at some point. In the meantime I'll continue to use online
services such as ChatGPT.
Why Host Your Own AI?
The biggest reason people go down the road of self-hosting is
privacy. When you use self-hosted AI, the content you generate stays with you. It
doesn't get uploaded to the cloud, and it won't be used to train someone else's model. That
kind of control is a huge plus if you’re working with sensitive or proprietary
information.
What You'll Need
Before you jump in, be aware: you’ll need some solid hardware to make this work.
- I'm personally using an AMD AM4 6800X processor. (Faster is better.)
- I have 64 GB of DDR4 memory. (More is better.)
- You’ll also need a decent NVIDIA graphics card. The AI models will work much better by utilizing the **CUDA cores** to aid in the processing. I’m using an EVGA 3060 TI FTW, which works pretty well—but there’s still a little lag at times. A better card would improve the experience even more. My card was purchased during the Covid-era GPU shortage, when crypto mining drove up demand, so I had to buy what was available at the time.
- The language model I selected is about 4 GB, which is fairly large.
What I Installed
There are three main software components involved in setting this
up:
- Ollama – A tool for running large language models locally.
- Llama 3.1 – The actual language model.
- WSL (Windows Subsystem for Linux) – This lets you run a Linux environment directly on Windows.
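For anyone curious what the setup actually involves, here's a rough sketch of the install steps. Treat it as an outline rather than a definitive guide — exact commands can differ by Windows and Ollama version, and the official documentation is the authority:

```shell
# 1. Enable WSL from an elevated PowerShell prompt on the Windows side
#    (run this in PowerShell, not in a Linux shell):
#    wsl --install

# 2. Inside the WSL (Linux) shell, install Ollama with its official script:
curl -fsSL https://ollama.com/install.sh | sh

# 3. Pull and start the Llama 3.1 model. The first run downloads the
#    model weights (roughly 4 GB), then drops into an interactive chat:
ollama run llama3.1
```

Most of my 45 minutes went to reading documentation; the commands themselves run in a few minutes plus download time.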
All of these are free to download. I got everything installed without
too much trouble using the online documentation. I can confirm—it does work! I think I
had it running in about 45 minutes. This included the time it took to read the
documentation.
That wasn't bad considering this was all new to me.
First Impressions
I haven’t had a ton of time to play with it in depth. Here’s what I’ve
noticed so far:
- It works similarly to ChatGPT, but the responses feel a bit rougher around the edges.
- You need to phrase things a bit more carefully to get the best results.
- It's a smaller LLM when compared to online services, so your results aren't as comprehensive. There won't be any information on current events. It's not for creating music, video or graphics.
- I'm hopeful that a free LLM will be released at some point in the 40-60 GB range. It would be small enough to run locally but larger and more comprehensive.
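To give a concrete idea of day-to-day use: assuming Ollama and Llama 3.1 are installed in WSL, you can send a one-off prompt from the command line, or script the local REST API that Ollama serves on port 11434. (The prompts below are just illustrative examples.)

```shell
# One-off prompt; response quality depends on how carefully it's phrased.
ollama run llama3.1 "Rewrite this sentence to be clearer and more concise: \
The install of the software was able to be done by me without much trouble."

# Ollama also exposes a local REST API, which is handy for scripting
# tasks like documentation cleanup:
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt": "Explain in one sentence why self-hosted AI helps with privacy.",
  "stream": false
}'
```

The API route is what I'd lean on for batch-rewriting documentation, since it keeps everything on the local machine.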
My long-term goal is to use AI to help rewrite my documentation
and parts of my website, to make things clearer and easier to read. If this setup can
help with that, I’ll consider it a win. If the self-hosted setup doesn't work to my
satisfaction, I will simply keep using ChatGPT or one of the others.
If there is enough interest I can share a more detailed walkthrough. For now, it's been
a fun introduction to the world of DIY AI!
Royalty Free Pixabay Image