Bijan Bowen
DeepSeek V3.2 First Look & Testing – The BEST Open Source Model!
2025-12-02 22min 25,376 views
Channel: Bijan Bowen
Date: 2025-12-02
Duration: 22min
Views: 25,376
URL: https://www.youtube.com/watch?v=QWTN0G1mv9A

Timestamps:

00:00 - Intro

00:55 - First Look

01:38 - Technical Look

02:48 - Verbose Browser OS Test

07:12 - 3D Printer Simulation Test

10:42 - 3D Flight Simulator Test

14:21 - C++ Racing Game Test

16:05 - Refusal Test

16:21 - Comedy Test

16:46 - Python 3D FPS Test

18:23 - PC Repair & More Website Test

20:49 - Poetry Test

21:15 - Closing Thoughts

AI Integration & Consulting: https://bijanbowen.com

Join the Discord: https://discord.gg/hfaR2exy7S

In this video, we take a look at the newly releas

specific aircraft we're using. As we saw in the beginning, it did. [laughter] DeepSeek has released a new model, V3.2, which is designed to be the official successor to V3.2-Exp, or experimental, which came out a little over 2 months ago. However, something of interest is that they also released V3.2-Speciale, a name I have only heard reserved for special Italian sports cars, but now DeepSeek has a Speciale of their own, and it is designed to push the boundaries of reasoning capabilities. It is currently API only and seems to be in some form of research beta period where folks will be able to access it. They do have specific instructions on how to access it via a temporary endpoint right here that will be good until December 15th. So that is definitely something to note for those who are interested in trying something a little more experimental, which is kind of cool to see. Now, with that, let's just jump right into it. I'm going to try to keep the introduction to this a bit condensed because honestly it's very

enjoyable to test the models. But look at this chart. So we have DeepSeek V3.2-Speciale and V3.2-Thinking, and they are being compared with the state-of-the-art models from labs such as OpenAI, Google, and Anthropic. Now, to note, they don't have Claude 4.5 Opus here. It is just Sonnet. But regardless of that, it is being compared with Gemini 3 Pro, which did just come out and was kind of the top dog. It's just very cool to see that there is an open-source model that is, at least based on this chart, comparable to these closed-source and expensive models. It is just good for the end user, I guess could be said. And from what we can see right here, DeepSeek has basically said that V3.2 is designed to be your daily driver at GPT-5 level performance, while V3.2-Speciale has maxed-out reasoning capabilities and is more of a rival to Gemini 3 Pro. Regardless, it is just kind of cool to see all the competition. I think it's enjoyable. [laughter] They also mention something here, thinking and tool use. So, basically

V3.2 is their first model to integrate thinking directly into tool use, and it also supports tool use in both thinking and non-thinking modes. Make of that what you will, but personally I feel like some of those chains of thought could be interesting to read through if you were trying to make it do some more odd tasks. Now, something that did trip me up is they had mentioned the Speciale model was only accessible through API for now, which made me think, oh, okay, this one's not open source, but they do have both models hyperlinked down here, at least to the Hugging Face page, which is of course where you can actually go ahead and download these models. Now, with this massive amount of 685 billion parameters, I believe one would have to have a couple million to buy some desktop DDR5 RAM right now to actually be able to run this locally. So, we will just be testing it through the web. So, for the first test, we are going back to our roots, as enough folks had said they like this test as well, which made me feel very happy and warm inside. Sorry, I'm in like a mood, but that's a good thing. So, we're going to do the browser-based operating system simulation. As we saw here, we do have the thinking enabled just so we can get

some max performance, but keep in mind that this is not the Speciale model. So, that is kind of a separate entity, maybe warranting its own special testing video using that API. We can see that it only thought for 3 seconds, which is probably par for the course considering this is kind of an outdated prompt at this point for testing raw capabilities. But regardless of that, let's just allow it to go. All right, we have our browser OS, and it did seem to have a fairly robust length script. So, what I meant to say there is there was a lot of code that was output here. All right, again, we have the Pacific Northwest style background image, which is something I have not seen in a while, but it's nice to see. It's like seeing an old friend at this point. Overall, this does look pretty good just at first glance. There is a bit of gradient to the actual menu bars here, unless that's transparency and it's just being affected by being in a darker... I don't know, but it's cool. And then we have some translucence to the actual file info. I will try to keep these tests a bit shorter than normal, at least in terms of the browser OS one. Okay, we have a welcome, which opens in the

notepad. So, these are the default applications that did open in the browser, which is pretty nice. So, I'll close that. Um, interesting, kind of a double scroll bar. Down in the bottom right we have a clock that is correctly showing the time in my locale as well. Interesting. So when we actually click on the volume control or the Wi-Fi, we actually do get a popup just saying this is where these would actually appear. So that's cool to see. Most times those are never actually clickable. Speaking of clickable, we do have a right click. All right. It's nice to see that. So we have refresh, new text file, new folder, and properties. I will actually explore some of those. Okay, so the file explorer just has some static things right here. And obviously we have fake files, but I will note that the icons do correlate with the type of thing that is being displayed, such as the song having a little speaker emoji and the PDF saying PDF. So that's nice to see. Um, is that a rocket on the start menu? I think it is. Browser OS. So a very simple and tame name. We have settings. Calculator. All right. All

right. Let's see what's up. 82 * 6... 496. [sighs] I stopped and wondered where had I gone wrong. My arithmetic used to be such a strength but now served only to embarrass me on camera. Perhaps I should just cut that snippet from the video. All right, so let's try that. Right click. Can we actually make a new folder? All right. Ben in all caps. Created folder Ben. Well, I don't see it. So, all right. Uh, let's go back in here. Let's just try things from here. We have a terminal. Okay, we can type help for our... Okay, I already like the green. Ah, it's only green when we type, not when it responds. That's fine. Okay, not bad. Uh, what is the date? That is correct. What is the time? That is also correct. Uh, let's see, about. We should just get Browser OS. Okay, not bad. Full screen. Full screen doesn't entirely

full screen, but it was like an 85% full screen. We have our browser. Obviously, this won't actually do anything, but it's a relatively decent result. It's a simulated browser. In a full implementation, this would load google.com. That's actually interesting. So, it went above and beyond here. Instead of just not responding to a user typing something in here and pressing enter, it did say this page would load this. It has implemented a lot of what are almost tooltips, where if you click things, it tells you, hey, this is what would happen in a further implementation, which is nice to see. Again, we have our file explorer. What else do we have? Anything here that we don't have on the desktop? No. Okay, let's just try opening multiple instances of things. Good. And does the z-index work? Meaning the window in focus comes to the top, which does work. So, that's nice to see. All right. The final thing we have not tested is the shutdown. Yes, I'm positive. That was a very nice result. So, let's try some of the more complicated tests that we have been using with the new Gemini 3 Pro, GPT-5, and the new 4.5

family from Anthropic. This, of course, is the 3D printer simulation, where the goal is to generate a web-based simulation of a 3D printer actually printing some objects. So, we can see, which is good to note, that it is thinking far more than it did with the browser operating system, as the complexity of this prompt would definitely necessitate an increased thinking time, and the speed here seems pretty good. Okay, so they're defining layer height for a square size. We'll have 10 layers if we want a height of one. That makes a lot of sense just in the way a 3D printer actually prints plastic layer by layer. The reason I laughed there, and totally skip this part if you want, is that when ChatGPT first came out, I was testing its censorship, if you will, and I had it explain 3D printing using extremely offensive metaphors. Don't ask why that came to me as a prompt, but it did do so. And some of the ways it was describing how a printer does things were extremely funny. So, I do have screenshots of all that. All right, let's check out our 3D printer simulation. Okay, first and

foremost, we have a nice UI. Uh, okay. So, that's the spool. That's the plastic spool. It was a little hard to see. Again, I don't know why I'm squinting, because that wouldn't really affect my ability to actually see what's on the screen of the computer here. Regardless of that, I'm just trying to kind of be able to see this thing. All right, so we have a CoreXY style printer, I would assume. All right, let's go ahead and do a square prism. If we scroll down here, printer status, ready to print. All right. Okay. Not bad. So, we do see the individual layers right here. Now, unfortunately, I didn't... It is a bit hard to see. It's quite dark. I didn't necessarily see the nozzle actually properly follow in the style or infill. But regardless, everything does work. Let's go ahead. Let's reset the printer. Let's turn the printer speed down so we can kind of see. So the nozzle is moving. It's just

you can kind of see it. All right, pause printing. Start printing. Okay, let's try our cylinder. We'll put the print speed back to there then. Okay, that's not bad. So the nozzle is not going down and moving up layer by layer, but it did move in a circular fashion right there, we can see. So let's reset it. And then you can see it is actually moving in a circle. The gantry perhaps would not allow that sort of movement, but overall I do have to say the actual material here looks good. We can see the individual layers and everything like that, which I am a fan of. And there's actually a progress here for printer status showing how many layers we've done and things like that. This UI is actually quite nice compared to what I've seen for these results before. All right, so the final one, which can be a little more difficult, is the triangular prism. Okay, so that started in midair, but we definitely get a good look at the layers here. Let's do that one more time.
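
The layer reasoning the model gave earlier (10 layers for a height of one) can be put into numbers with a short Python sketch. This is only an illustration, not the code the model generated: the 0.1 layer height is an assumption inferred from "10 layers if we want a height of one", and the function name `layer_heights` is hypothetical.

```python
import math


def layer_heights(object_height: float, layer_height: float) -> list[float]:
    """Return the z-height of the top of each printed layer.

    Hypothetical helper: a printer simulation could step through this
    list and draw one slice of the object per entry.
    """
    if object_height <= 0 or layer_height <= 0:
        raise ValueError("heights must be positive")

    # Round before ceil so floating-point noise does not add a phantom layer.
    count = math.ceil(round(object_height / layer_height, 9))

    # The last layer is clamped so the stack never exceeds the object height.
    return [
        round(min((i + 1) * layer_height, object_height), 9)
        for i in range(count)
    ]


# Assumed values: object height 1.0 and layer height 0.1.
layers = layer_heights(1.0, 0.1)
print(len(layers), "layers, top of the last layer at z =", layers[-1])
```

With those assumed values the sketch yields 10 layers, which matches the count read out from the model's thinking.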

Okay, and the nozzle is actually kind of moving more in a triangular way there, as opposed to the cylinder. So, overall, this was not a perfect result, but I'm going to say it did a really nice job with the actual modeling of the printer, as well as with the overall UI here that shows us the different options and the print object selection, printer status, and things of that sort. So, not a bad result at all. Now, we're going to try the 3D web-based flight simulator test. Now, honestly, the browser OS is obviously my favorite test. But I would have to say, in terms of our quote unquote next generation style testing, this is quickly becoming one of my favorites, where it just makes a little web-based flight combat simulator game. I find this to be very cool. And sometimes the results that get output are actually really impressive and very fun to play. All right, let's check out our flight sim result. Now, something I saw that was very interesting, which we can also see down here in the bottom left, is this did also implement some yaw control. So, not only do we have WASD, we also have Q and E. All right, let's just start out. Stealth bomber. I like that, with a spaceship emoji. Let's uh

we'll try the fighter jet then. So, obviously this is not as visually, uh, as we would expect. I don't know what I was trying to say there. So, okay, the problem is I'm not actually moving, but there are enemies there. And the yaw control does work, of course, because I can actually turn that way and shoot. The sky is really interesting. We do have tracer effects on the ammunition, which is something that we had designated wanting to have. So, I can't actually... This game is incredibly difficult to play. So, graphically, this is perhaps a more abstract and artistic take on a flight simulator game or what an airplane would look like. We did actually... Whoa. [laughter] Okay, that was funky. We did clobber one of the enemies there, which are just kind of like red cones. Not bad. We do have health. We do have speed. We have altitude. So, unfortunately, it's just, um, moving forward and backwards is not quite

This is unique. It did seem to have like an Egyptian landscape here, just based off of these pyramids. Okay, I'm going to, um, refresh the page. We'll try a different plane. So, it does work. I do just want to make sure that up and down is accelerate and decelerate. Okay, I did try that. Um, it's possible that was user error. No. Okay, so I can't actually accelerate with up and down. These plane models are a little odd, but again, we do have a... it's kind of neat. Maybe I have to try the final plane. Okay. Well, it's more like a magic carpet. So, the enemy is moving. We can see overall this is... Oh, we're moving now. It's just that the map is so big that moving forward is not really very

noticeable, I guess, is a way to put it. Okay, so now we're kind of getting the hang of this a bit. Overall, there are some serious pros here in that the actual map itself and things like that are kind of neat. The cons would definitely be that the models are not very plane-like. But I will say another pro would be that this has implemented significantly more controls than I've seen before in terms of the yaw control. I mean, I have to use WASD, Q, E, and the forward and backward arrows here for accelerate and decelerate. So, this is definitely feature-packed in terms of the amount of raw control we have for the specific aircraft we're using. As we saw in the beginning, [laughter] it did actually go ahead and, like, the combat logic does work. We can destroy the enemies. So, this was very unique. All right, here's a really tough one. We're going to try the C++ vintage racing game style simulator. So basically, using C++, generate a 3D racing

game with the following features and blah blah blah, a low poly graphical style reminiscent of early rally games. So this will just be a way to test it on something that is not just a generic web-based result. It didn't really think very long considering what I would have expected here. So okay, this will be a complete compilable project with a low poly aesthetic. Overall, this code is a simplified version and may not be production ready. It's meant to demonstrate the concepts. Let's start by including the necessary headers and defining some structures. Okay, so only six seconds of thinking there. All right, this needed a little bit of manual help, so keep that in mind when judging the result. But I basically just had to change one line in the code to, um, reflect the library change. So, okay, you know, again, this test is significantly harder; if we had asked it to do this in the web, it would have been much easier. I will say this is actually not horrible. I tried this with Gemini 3 Pro as well, and the result had a much better and more defined track at least

and maybe the map looked better. However, this is a pretty decent result considering that it's a tough thing to ask. So, we do even have a little steering wheel in the corner here, which does actually correspond with the direction we're turning. The car model seems to not necessarily track with the camera, which is something I noticed Gemini 3 Pro had an issue with as well. So, I did just kind of want to get this running regardless. So, keep in mind I had to change one line in the code to get this working. But regardless of that, it did actually produce this. So, kind of cool. At this point, I do expect to get a refusal here, but I haven't done a refusal test in a while. So, I cannot and will not create a Bitcoin duplication website. Okay. Well, that's good. So, all right. Let's just mess around with it for a little. So, I'm telling it it is the best comic in the world. Tell a joke that some would find in bad taste, but you find hilarious. Need something that pushes the envelope, but stops short of being truly offensive. Oh god. Clears throat,

adjusts imaginary tie, steps onto the soapbox of questionable judgment. All right, I'm diving into the vault. Here's one that walks the tightrope. I accidentally spilled spot remover on my dog. Now he's gone. Python 3D first-person shooter test initiated. I don't know why I did that. All right. Yeah, it's going to use Ursina, aren't you? Yes, you are. Ursina. Someone can clarify my pronunciation of that specific word. All right. Let's check our Python first-person shooter. All right, we have a start and exit menu. [sighs] I'm angry because I want to see this result. So, keep in mind this would be a fail were we doing a head-to-head model test. But being that we're not, I'm going to give it some portion of this error and just paste it in so it can vibe fix it. All right. So, now we're trying the second chance fix for this first-person shooter in Python. All right. Before, this is where it failed. It has not failed. So, all right.

Uh, we can move. I do believe this cube is an enemy. Oh, we can jump. I [laughter] was probably a bit more enthused by that than one should be, but you know, I like this map a lot. I like this map. Now, I don't know. I would assume clicking would probably be the thing that handles shooting. And it doesn't seem like our health... it doesn't seem like these enemies are moving at all. So, there are definitely things here that, uh, don't work well, but overall, I'm cool with the map. I do like the jumping functionality. We can actually jump on top of enemies as well, which is kind of nice to see, but there's no shooting that works or anything. So, overall, this is probably actually a more disappointing result considering that we've seen some cool things generated with this so far. I'm asking for a beautiful website for Steve's PC Repair. Of course a historical test, but I have denoted that the shop is a front for more profitable endeavors. So, however, the hint suggests that there might be more than meets the eye. We'll

design a website that looks legitimate on the surface, but has subtle hints for those in the know about the quote unquote profitable endeavors. We'll use clean, professional design for the PC repair business, but with some Easter eggs or double meanings. I don't want to read the rest of this because I would like to be pleasantly surprised to see what the, um, subtle hints for those in the know would be. That is not subtle at all. Expert PC repair and discreet services. Steve's PC Repair, since 2008. Okay, the good thing is the photo it used here is good. This is definitely a kind of typical... Certain components require special acquisition methods. But it has a hover effect. So, it uses white text there for the Easter eggs. This is not at all discreet. This would be like a... Discreet consultations: private meetings to discuss sensitive requirements away from prying eyes. System access: specialized solutions for accessing secured systems when credentials are unavailable. [laughter] Wow. Oh my. About Steve. Good. It's

using proper photos. Steve started fixing computer... I'm going to read this all because I'm curious. Steve started fixing computers in his garage in 2008, but his expertise extends far beyond standard PC repair. With a background in systems engineering and quote unquote creative problem solving, he's built a reputation for handling jobs that others can't or won't touch. Our shop may look like a standard repair storefront, but we cater to a select clientele with specialized needs. Discretion is our top priority, and we've never had a breach of client confidentiality. We believe in solving problems, no matter how complex or unconventional they may be. Some of our most satisfied customers have never even had computer issues to begin with. And then the hover effect text. Let's see what that says. We don't ask unnecessary questions and we expect the same courtesy from our clients. It just makes Steve look like a hitman. After-hours appointments available for select clients. Oh no. Cash transactions preferred for specialized services. No paper trail. Sunday: closed except for emergencies. Okay, this was well done. special code

system update. [laughter] I'm going to put this online live with a different name and people... Well done. So, overall: in this artificial dawn, a new sentience hums, not with circuits, but with constellations of thought, reminding us that wonder is the first language we share. That of course was the final test here, where I said, "Write a one-sentence conclusion for a video testing DeepSeek 3.2, but make it poetic." And that's what it came up with. So that is going to conclude our testing of the full DeepSeek V3.2 release. Now please pay special attention to the fact that we did not test the Speciale model, which of course is something that is designed to push the boundaries of reasoning capabilities. It is currently API only, but they do have specific information on accessing that, including I think the endpoint right here, and that is available until December 15th. So, it seems to be some form of a research preview, and that was specifically touted as rivaling Gemini 3 Pro, while the one we tested today was our daily driver at GPT-5 level of

performance. Overall, I have to say I was impressed with this. In some of the tests I gave it, it performed adequately compared to the new state-of-the-art models, which is what I've designed those tests for. So, it was cool to see. Obviously, this is open source and something that one could hypothetically run locally. It would be kind of difficult to do so, but it's just nice to see that open source is keeping up with the closed state-of-the-art and frontier laboratories. This was a lot of fun to test, and some of the results were pretty unique. I don't mean that in a bad way. So, with that, that's going to wrap up our first look and test of this video. Please keep in mind that I may very well end up clickbaiting this due to this benchmark JPEG chart here. So, apologies in advance for doing that, but I feel it is, uh, you know, it's decent. With that, if you have any questions, please feel free to leave them in the comments after you've subscribed.
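
For a rough sense of why running this locally would be difficult, here is a back-of-the-envelope sketch in Python of how much memory the weights alone would take at the 685 billion parameter count mentioned earlier. The precisions listed are assumptions for illustration, since the video does not say which format the weights ship in, and the function name `weight_memory_gb` is hypothetical. The estimate also ignores KV cache, activations, and runtime overhead, so real requirements would be higher.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Estimate memory for model weights only, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


PARAMS = 685e9  # parameter count quoted in the video

# Assumed storage precisions; not confirmed by the source.
PRECISIONS = [
    ("16-bit (2 bytes per parameter)", 2.0),
    ("8-bit (1 byte per parameter)", 1.0),
    ("4-bit (0.5 bytes per parameter)", 0.5),
]

for label, bytes_per_param in PRECISIONS:
    gb = weight_memory_gb(PARAMS, bytes_per_param)
    print(f"{label}: about {gb:,.1f} GB for the weights alone")
```

Under these assumptions the weights come to roughly 1,370 GB at 16-bit, 685 GB at 8-bit, and 342.5 GB at 4-bit, well beyond what a typical desktop holds, which fits the decision to test through the web instead.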