Yesterday at work, I found myself in a hellhole of test coverage.
You know the drill: writing unit tests for utility functions. I had already covered 80% with help from some regular AI models. But then I was told to push it to 100% coverage.
Shit. I hate this part.
## The Test Coverage Grind
Hitting 100% test coverage is no joke, even if you’re using the smartest LLMs available for free.
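For context: the post is tool-agnostic, but if you’re on a Jest + TypeScript stack (adjust for your own runner), “push it to 100%” boils down to a coverage threshold roughly like this. A minimal sketch; the preset and paths are placeholders, not my actual project:

```ts
// jest.config.ts -- a minimal sketch for a Jest + ts-jest setup
// (your runner will differ; 'src/utils' is a placeholder path).
import type { Config } from 'jest';

const config: Config = {
  preset: 'ts-jest',
  collectCoverage: true,
  // Only the utility functions under test count toward the numbers.
  collectCoverageFrom: ['src/utils/**/*.ts'],
  // The "push it to 100%" mandate, encoded: the run fails below these.
  coverageThreshold: {
    global: { branches: 100, functions: 100, lines: 100, statements: 100 },
  },
};

export default config;
```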
I spent hours battling edge cases, making sure every branch, condition, and scenario was covered. My test file grew to 1000+ lines.
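To make the grind concrete, here’s the shape of the problem with a stand-in helper (the real utilities are from work, so this `clamp` is hypothetical): one small function with guard clauses, and every branch demanding its own test.

```ts
// clamp.ts -- hypothetical stand-in for the real utility functions
export function clamp(value: number, min: number, max: number): number {
  if (Number.isNaN(value)) throw new RangeError('value must not be NaN');
  if (min > max) throw new RangeError('min must be <= max');
  if (value < min) return min;
  if (value > max) return max;
  return value;
}

// clamp.test.ts -- five branches, five tests; multiply by every utility
// in the file and you see how it balloons past 1000 lines.
import { clamp } from './clamp';

describe('clamp', () => {
  it('returns a value inside the range unchanged', () => {
    expect(clamp(5, 0, 10)).toBe(5);
  });
  it('clamps below the minimum', () => {
    expect(clamp(-1, 0, 10)).toBe(0);
  });
  it('clamps above the maximum', () => {
    expect(clamp(11, 0, 10)).toBe(10);
  });
  it('throws on NaN', () => {
    expect(() => clamp(NaN, 0, 10)).toThrow(RangeError);
  });
  it('throws when min > max', () => {
    expect(() => clamp(1, 10, 0)).toThrow(RangeError);
  });
});
```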
## DeepSeek R1 vs ChatGPT o3-mini: The Showdown
At first, I was using DeepSeek R1—a free reasoning model. It was… okay. It helped, but it still needed multiple attempts to get things right.
Then I switched to ChatGPT o3-mini.
👀 HOLY. SHIT. 👀
- Faster? ✅ Yep. Feels instant.
- More precise? ✅ Hell yes. No second attempts needed.
- Fewer hallucinations? ✅ Much more reliable.
First attempt—boom, got the correct test case.
## The Refactor Test
After getting 100% coverage, I thought: “This test file is bloated as hell.”
So I asked o3-mini to refactor it.
✅ From 1000 lines → 600 lines
✅ Still 100% test coverage
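I can’t show the real diff, but the classic move for cutting a test file almost in half without losing coverage is collapsing copy-pasted `it()` blocks into table-driven tests. Using the hypothetical `clamp` from above, the refactor looks roughly like this:

```ts
import { clamp } from './clamp';

// Each row replaces a whole it() block; every branch is still exercised,
// so coverage stays at 100%.
const happyCases: Array<[string, number, number, number, number]> = [
  ['keeps a value inside the range', 5, 0, 10, 5],
  ['clamps below the minimum', -1, 0, 10, 0],
  ['clamps above the maximum', 11, 0, 10, 10],
];

const throwingCases: Array<[string, number, number, number]> = [
  ['NaN input', NaN, 0, 10],
  ['min greater than max', 1, 10, 0],
];

describe('clamp (refactored)', () => {
  it.each(happyCases)('%s', (_name, value, min, max, expected) => {
    expect(clamp(value, min, max)).toBe(expected);
  });

  it.each(throwingCases)('throws on %s', (_name, value, min, max) => {
    expect(() => clamp(value, min, max)).toThrow(RangeError);
  });
});
```

Whatever the model claims, re-run `jest --coverage` after a refactor like this; the threshold config above is what actually proves the “still 100%” part.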
Then I thought, “Wait, did I lose any edge cases?”
So I ran o3-mini again and told it to restore any edge cases it had dropped.
✅ Final result: 600 lines, still full coverage.
## Final Verdict?
🚀 ChatGPT o3-mini absolutely destroyed DeepSeek R1.
Not even close. The same might hold for o1 vs R1, but I haven’t tested that matchup yet.
## What’s Your Experience?
Have you tested different LLMs for coding yet? What’s your go-to model for free AI coding assistance?