
Using Pomerium to Secure LLMs with Nick Taylor

A practical conversation about zero trust security, running local language models, and how to safeguard internal endpoints using Pomerium


Episode Description

Nick Taylor demos Pomerium's zero-trust security model by securing a local Ollama LLM endpoint and building a GitHub Copilot extension to interact with it.

Episode Summary

Nick Taylor joins Anthony Campolo to discuss his new developer relations role at Pomerium, an open-source zero-trust security platform, and demonstrates how it can secure locally running LLMs. The conversation begins with Nick explaining his career transition from full-time engineering to devrel after his previous company, OpenSauced, was acquired by the Linux Foundation and most of the engineering team was not retained. He then breaks down the zero-trust security model by contrasting it with traditional VPN-based perimeter security, using an airport analogy to illustrate how continuous verification works at every access point rather than granting blanket trust after a single login. The technical heart of the stream involves Nick walking through the architecture of Pomerium—its identity-aware reverse proxy operating at layer seven, its open-core business model, and the managed control plane called Pomerium Zero. He then live-demos building a GitHub Copilot extension powered by a local Ollama instance, configuring Pomerium routes and policies to secure that Ollama endpoint, and proving access control works by granting Anthony permission in real time. The stream highlights how even hobbyists or small teams can improve their security posture with minimal setup, and closes with a discussion of running open-source models locally and the practical limitations of hardware for larger models like DeepSeek R1.

Chapters

00:00:00 - Catching Up and Career Updates

Anthony welcomes Nick back to the stream, noting he was one of the earliest guests back in 2022. They exchange personal updates: Anthony shares progress on his AutoShow project, including payment systems and potential cryptocurrency sponsorship through Dash, while continuing content work for Semaphore. Nick reveals he recently left OpenSauced after the company was acquired by the Linux Foundation and most of the engineering team wasn't retained.

Nick describes how his network led him to Pomerium, where a former Netlify colleague recruited him. His interview journey was unconventional—he initially pursued a product engineer role and even completed a backend engineering test in Go despite not writing it daily. Ultimately, Q1 goals and ramp-up concerns redirected him toward a developer relations position, which he accepted despite having spent his entire career in full-time engineering.

00:05:46 - Entering the Security Space and DevRel Under Marketing

Nick describes the learning curve of moving into the security domain after years as an application developer. His fresh perspective proves valuable for improving Pomerium's documentation, where he's already identified gaps like Docker configuration examples that didn't actually work out of the box. He emphasizes the importance of being explicit in docs and making sure code examples function when copied directly.

The conversation shifts to the organizational placement of developer relations. Nick shares that he sits under marketing at Pomerium, which doesn't bother him—he sees it as an opportunity to learn the business side. Anthony reflects on his own devrel experiences at StepZen and Quick, arguing that devrel is inherently multidisciplinary and engineers shouldn't view marketing involvement as beneath them. Nick introduces his small but capable team and stresses that Pomerium's highly technical product requires deeply technical devrel.

00:13:20 - Zero Trust Security Explained

Nick provides a thorough explanation of Pomerium and the zero-trust security model. The product is an identity-aware proxy that continuously verifies who users are and what they're authorized to do before granting access to internal resources. He contrasts this with traditional VPN-based perimeter security, where once a user connects to the network they're implicitly trusted—a model he likens to getting past a bouncer and then having free rein inside.

Using his experience at Autodesk as an example, Nick highlights practical problems with VPNs: the IT burden of supporting tens of thousands of client installations and the security risk of lateral movement once inside the network. He explains how Pomerium operates at layer seven as a reverse proxy, enforcing policies based on identity, roles, device context, and other criteria. Anthony draws a comparison to AWS IAM, and Nick clarifies that IAM covers only the identity piece, while zero trust layers on continuous policy enforcement.
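The layer-seven enforcement described above can be sketched in a self-hosted Pomerium route config. Everything below is illustrative — hostnames, the internal address, and the group name are hypothetical, not taken from the stream:

```yaml
routes:
  - from: https://prod-cluster.corp.example.com   # identity-gated external hostname
    to: http://10.0.0.5:6443                      # internal service Pomerium fronts
    policy:
      - allow:
          and:
            - domain:
                is: example.com        # must authenticate through the company IdP
            - claim/groups: on-call    # and carry an on-call role claim
```

A request failing either criterion never reaches the upstream service at all — the distinction from a VPN model, where the network itself is reachable after a single login.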

00:22:18 - The Airport Analogy and Network Layers

Nick uses an airport analogy to make zero trust tangible: just as travelers are checked at the ticket counter, security screening, the gate, and even VIP lounges, Pomerium verifies users at every access point rather than granting one-time entry. He contrasts this with VPNs, which he compares to passing a single checkpoint and then wandering freely through secure areas. The analogy resonates as a way to communicate complex infrastructure concepts to non-technical audiences.

The discussion turns to networking layers. Nick explains that Pomerium operates at layer seven, the application layer, where HTTP requests, headers, and routing decisions can be inspected and controlled. VPNs typically operate at lower layers where this granular policy enforcement isn't possible. Anthony notes that web developers essentially live at layer seven since everything below has been abstracted away, and Nick shares a concrete example from his Netlify days about header manipulation and CDN cache behavior.

00:27:00 - Pomerium Architecture and Open Core Model

Nick shares his screen and walks through Pomerium's GitHub presence and open-core business model. The core product is fully open source—anyone can fork it and self-host it—while enterprise features and a managed control plane called Pomerium Zero provide the revenue model. Bobby, the CEO, founded the project with a strong commitment to open source after a previous acquisition in the security space.

The architecture centers on a control plane for defining policies and an access plane that runs wherever the user's network lives. Pomerium Zero launched alongside a thirteen-million-dollar Series A and provides a web UI for policy management, but the actual proxy runs locally, keeping latency minimal since it sits directly in front of internal applications. Nick explains how Envoy powers the reverse proxy layer and how cached policies update near-instantly when changes are made in the control plane.

00:37:32 - Building the Ollama Copilot Extension

Nick pivots to demonstrating a GitHub Copilot extension he built that connects to a locally running Ollama instance. He explains the mechanics: a Copilot extension requires a web app registered as a GitHub app, with specific permissions for Copilot chat access and editor context. Using VS Code's built-in port forwarding, he exposes his local app to the internet so GitHub can route Copilot requests to it.

Walking through the code in debug mode, Nick shows the full request lifecycle—from Copilot chat input through payload verification using the Copilot SDK, to the actual Ollama API call with prompt engineering that incorporates file references. He demonstrates the extension responding to coding questions, though the Code Llama model's output quality is limited. The groundwork is laid for the next step: securing this Ollama endpoint with Pomerium so it can be accessed from anywhere.
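The prompt-engineering step — folding referenced files into the model call — can be sketched with a hypothetical helper that builds an Ollama `/api/chat` request body. The function name, system prompt, and message layout are illustrative assumptions, not the actual extension code:

```typescript
// Shape of a chat message as Ollama's /api/chat endpoint expects it.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Hypothetical helper: combine the user's Copilot chat prompt with the
// contents of files referenced in the editor context.
function buildOllamaRequest(
  userPrompt: string,
  fileContents: string[],
  model = "codellama"
) {
  const messages: ChatMessage[] = [
    {
      role: "system",
      content: "You are a coding assistant. Use the referenced files as context.",
    },
    // Inline each referenced file so the local model can see editor context.
    ...fileContents.map(
      (file): ChatMessage => ({ role: "system", content: `Referenced file:\n${file}` })
    ),
    { role: "user", content: userPrompt },
  ];
  // stream: true lets the extension forward tokens to Copilot chat as they arrive.
  return { model, messages, stream: true };
}
```

In the flow Nick demos, the extension first verifies the request signature with the Copilot SDK before doing any of this, so unauthenticated callers never reach the model.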

00:47:25 - Configuring the GitHub App and Copilot Settings

Nick walks through the GitHub app configuration required for the Copilot extension, emphasizing security best practices like the principle of least privilege. The app needs only two permissions: read-only Copilot chat access and the newer Copilot editor context permission for accessing referenced files. He shows the Copilot-specific settings where the extension type is set to "agent" and the callback URL points to the port-forwarded local server.

The walkthrough covers practical details like disabling unnecessary webhooks, setting visibility to private during development, and the option to make extensions publicly installable when ready to ship. Nick demonstrates using the @ mention syntax to invoke custom extensions in Copilot chat and shows real-time responses flowing from his local Ollama instance through the extension, establishing the baseline before adding Pomerium's security layer.

01:01:40 - Securing Ollama with Pomerium Zero

Nick switches to the Pomerium Zero dashboard and creates a route to secure his local Ollama endpoint. After configuring the route with his machine's local IP address and Ollama's port, the endpoint immediately returns a 403 forbidden response to unauthenticated requests. He then creates an access policy restricting access to his email domain and demonstrates the entire flow working—Ollama is now accessible only through Pomerium's identity verification.

A key troubleshooting moment occurs when Nick realizes Ollama binds only to localhost by default, requiring an environment variable to allow access via the machine's IP address. Anthony then tests from his own machine, initially receiving a login gate and then a forbidden response. Nick live-edits the policy to add Anthony's email, and access is granted almost instantly thanks to the cached policy invalidation. The demo powerfully illustrates how quickly zero-trust policies can be applied and modified.
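The localhost-binding fix uses Ollama's `OLLAMA_HOST` environment variable. A minimal sketch for a manual run — service managers such as systemd or the macOS menu-bar app set this differently:

```shell
# By default Ollama listens on 127.0.0.1:11434, so a Pomerium route pointing
# at the machine's LAN IP gets connection refused. Binding to 0.0.0.0 exposes
# it on all interfaces -- acceptable here only because Pomerium fronts the port.
export OLLAMA_HOST=0.0.0.0:11434
ollama serve
```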

01:28:35 - Open WebUI, Model Limitations, and Wrap-Up

The final segment covers securing Open WebUI, a frontend for local models, with the same Pomerium setup. Nick shows how he's applied identical routing and policy configurations to make it accessible from anywhere while remaining protected. The conversation turns to practical considerations around running open-source models locally—Anthony points out that Nick's DeepSeek R1 instance is a distilled version at around four gigabytes, while the full model has over 600 billion parameters and requires roughly a terabyte of storage.

Nick delivers a final pitch for Pomerium: it uses zero-trust principles to continuously verify identity, actions, and device context, representing a significant improvement over VPN-based security. He encourages viewers to start with Pomerium Zero's free tier and Docker installation for a quick setup. Anthony expresses interest in using Pomerium for his own AutoShow project to expose local models securely, and both agree to collaborate again soon. The stream ends at approximately one hour and forty minutes.

Transcript

00:00:02 - Anthony Campolo

Alright, we are live. Welcome back to AJC and the web devs with one of my favorite people to stream with, Nicky T. What's up, man?

00:00:13 - Nick Taylor

How you doing, man? Thanks for having me on.

00:00:15 - Anthony Campolo

Yeah, doing good. You were one of the first guests on this stream, back on episode four or something, when you were working at Netlify. That was, I think, SSR and Astro and stuff, or just SSR.

00:00:28 - Nick Taylor

Okay. Yeah, it's been a minute. That was probably in 2022. So how you been? I don't think I caught up with you since the holidays.

00:00:46 - Anthony Campolo

Yeah, I'm doing good. You've kind of seen me building out AutoShow. I've done three streams on your channel about it, and it's at the point now where I'm putting together the payment system and log in and about to ship the sucker. So that's pretty sweet.

I'm still doing some content for Semaphore and getting back into the fold with Dash. I'm actually trying to get Dash to sponsor AutoShow for me to build in a cryptocurrency payment option, so that could be pretty interesting. But yeah, not really much has changed since the last time we talked, honestly.

00:01:25 - Nick Taylor

Okay, cool. Yeah.

00:01:27 - Anthony Campolo

You have a new job now?

00:01:29 - Nick Taylor

Yeah, a few things changed for me. I was working at OpenSauced, which was awesome. My team's awesome over there. We got acquired by the Linux Foundation, which is awesome, too. Unfortunately took Becca.

00:01:46 - Anthony Campolo

No one else. Is that what happened?

00:01:48 - Nick Taylor

Well, most of the engineering team didn't make it over there, which, you know, it is what it is. These things happen during acquisitions, so no bad blood or anything.

00:02:02 - Anthony Campolo

Eminently hirable though.

00:02:05 - Nick Taylor

Yeah, obviously I was a little bummed out about it, but I put the word out that I was looking for new challenges and definitely got to give a big shout out to my network because people reached out, including Pomerium where I'm at now.

One of the VPs of engineering over there is ex-Netlify, where I used to work, and just reached out. For full context, I've been full-time engineering pretty much my entire career. But every time I livestream and I have a guest on that is devrel, they're like, "Oh, I thought you were devrel." So I've heard that a lot.

00:02:46 - Anthony Campolo

I've said this a bunch of times to you.

00:02:49 - Nick Taylor

Yeah. So this was kind of a funny process. I was interviewing for full stack roles, front end roles, and entertaining devrel roles, too.

For Pomerium, it was super interesting because RJ, who's the VP of engineering over there, was like, "We're looking for a product engineer, maybe." I got to chat with him and spoke to a couple people, and then they were like, "We don't really have an interview process for that because they're predominantly back end devs." There is a front end over there as well, but they were like, "I don't really know what to give you for a test."

So I ended up doing the back end engineer exam test, which I did pretty decent at considering I don't write Go daily and the core of the product is a reverse proxy. I enjoyed it.

[00:03:54] But I knew I wouldn't finish everything. I tried to focus on the core of the interview test. That went well, and then I had some conversations, spoke to a back end engineer after talking about the test and the interview, and chatted about some other things.

Then, we've got a lot of ambitious goals for Q1, and I think RJ was like, "Might not be setting you up for the best success because we're going to be moving super fast." Even though I'm totally fine being thrown into the unknown, I think it was more like the ramp-up and stuff might have really deterred from a Q1 kind of goal, maybe.

00:04:40 - Anthony Campolo

And we'll also talk about this later, but as a security company, you can't really move fast and break things. There's a little bit less of that, I would guess.

00:04:48 - Nick Taylor

Yeah, that too. He had mentioned this kind of at the beginning, like, "Oh, we're also looking for a devrel." It was just kind of funny. It's like there's good news and bad news.

He was like, probably the back end role, because it was kind of maybe going to be a product engineer or back end-ish role. But he's like, "Yeah, that's the bad news," like I was mentioning about the Q1. They got really ambitious, in a good way, for Q1. But he's like, "There's that devrel role I talked to you about before."

So I started having a conversation with my now manager, Nikhil, and it definitely sounded interesting. It's kind of a bit out of my wheelhouse in the sense that I've traditionally been an application developer. So yeah, I did a bunch of .NET in the day and then more recently in the past eight years, it's been more like full stack JavaScript and in some cases full front end depending on the role.

[00:05:46] So I'm kind of being thrown into our main product, which essentially uses zero trust principles, which we can get into if you want. But essentially the core of it is a reverse proxy. It's the whole security gamut.

00:06:16 - Anthony Campolo

All the doors are locked for everybody. And then you start by giving individual people's individual keys for individual doors. That's how I would explain zero trust.

00:06:25 - Nick Taylor

So there's definitely a bit of a learning curve, and I've been ramping up. I'm literally only at, like I was saying before the stream, week three, though. Oh yeah, no, I'm enjoying it.

00:06:39 - Anthony Campolo

As you say, because you're also a very experienced dev who is in kind of a different space. So you're the type of person that they're going to want, like going through their docs and letting them know whether it makes sense or not, because you'll find the rough spots that someone who is really into security maybe would just kind of blow through.

00:06:58 - Nick Taylor

Yeah, that's a good point. I'm still figuring out my schedule work-wise, like is today the day I focus on docs, or is today video-creation day and stuff. But I have been going through the docs, and to be transparent, they definitely do need some work.

I want it to be that somebody, even if they already work in security or infra or platform engineering, can come in and, you know, here's the code example to start something up, it should just work.

There were things to get Pomerium running in Docker where I figured out the example wasn't actually passing in the config file; it just ran the image, which only outputs the software version. So I got blocked right away because I was trying to set it up locally.

[00:07:56] So I just made it more explicit: you pass in the config, this is literally how it looks. And then I link to where the config is if you need to configure it and stuff, and just kind of explain it, not like I'm five, but really be explicit and make sure that things just work if you copy them or whatever.
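The corrected example Nick describes amounts to mounting the config into the container and pointing Pomerium at it. A sketch — the image tag, ports, and paths are illustrative:

```shell
# Running the bare image only prints version info; Pomerium needs an
# explicit config. Mount it into the container and pass -config.
docker run -p 443:443 \
  -v "$(pwd)/config.yaml:/pomerium/config.yaml:ro" \
  pomerium/pomerium:latest \
  -config /pomerium/config.yaml
```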

So that's something that will just be ongoing. I think it's just because, one, there's a lot of docs. And as your product evolves, everything else evolves.

So yeah, there's stuff like that too. I do think, because I'm new there, there's a fresh set of eyes on something for sure. But also to your point, I feel like there's stuff that maybe as a back end dev or something, you might be just used to these things.

And it's not, you know, I definitely understand so far what I've been digging into, but I think just being more explicit about things is always better, or even over-communicating it in the docs is better than less docs.

[00:09:01] But anyways, that's not a knock on the team. I think anywhere I've worked, docs could always use improvements. These are things that are part of my role now.

I'm just excited about it because career-wise, I think it's interesting to branch out as well. If I end up moving on somewhere else at some point, it's not like this experience would have been a bad thing. It's definitely different because it's a security space, like you're saying, but I'm pretty excited about it.

The team's been really welcoming, and I've got a good crew on my immediate team and stuff.

[00:09:50] I guess the only thing that's different, and it's not a bad thing, is I'm under marketing. I know devrel have opinions about where they are in the org, but being under marketing doesn't bother me. It's more like I think it'll actually be helpful for me to understand a bit more about the business side of things.

So all I'd say is, yeah.

00:10:15 - Anthony Campolo

Yeah. The question was always, should devrel be in marketing or engineering or product, because you can kind of make a case for it being in a little bit of all of them. So yeah, when I did devrel for StepZen, it was a ten-person startup. So none of that even existed. There were really no departments.

I guess I was kind of under the person who was like the head, who would have been like the marketing department, but it wasn't a fully built out marketing department. And then a quick note, I was under product. And that was cool because, you know, the product was very meaty, so I had a lot to get into there.

And they were a type of company that didn't want to be really like marketing. They wanted to just kind of speak to engineers, really. But, you know, I think devrel people, they kind of have a chip on their shoulder about not being marketing. And I think that's the wrong way to think about it.

I think it's a multidisciplinary role that includes a chunk of marketing. And you shouldn't see that as a bad thing that you're, like, too good to do, you know?

00:11:27 - Nick Taylor

Oh, yeah. Like I said, I'm not fazed that I'm under marketing. It's more just, there's some things I need to learn, that's all.

But I've got a cool crew. We're a small company, so my team is me. And then I work with Angie, who does a lot of blog content and just managing things. And she runs most of it. So we share the socials a bit, but she's been doing a lot of stuff.

I honestly still don't know all the things she does because this is literally week three, but she's doing a lot of great work. And then it's my manager, Nikhil. So we're like a small, mighty crew.

And then I think for me, the thing, to your point, what you're saying about having been full-time engineering for quite a long time is that our product is definitely super technical, you know what I mean?

[00:12:20] It's not like, although I'm a big fan of Netlify, or if you're using something like Vercel, it's not like one-click deployment kind of stuff. It's really at layer seven in terms of your infra and you're going to be deploying this. It's not a PaaS. You're managing these things, right?

So it's just different. And because of that, you do, I think anybody, if we end up getting other devrel, they're going to have to be super technical as well.

And I don't know, I'm just finding it interesting. Like I said, week three only, but I'm just digging into stuff, asking questions because it's the classic thing when people have been working on something for a long time, they know how it works. Or like, "Oh yeah, that quirk, whatever," you know?

So I'm just hopefully asking good questions.

00:13:20 - Anthony Campolo

Let's get real simple with it. So who would use this and why? And like, what would they build with it? Yeah.

00:13:29 - Nick Taylor

Yeah. So the TLDR: the product and the company, it's called Pomerium. It's an open source project. It got started by our CEO, Bobby, who's a big fan of open source and been in the security space for a while.

Who would use this is essentially a company that wants to safeguard their internal apps. So anybody who's traditionally used a VPN, like when I worked at Autodesk, for example, huge company, if I ever worked from home, I would have to log in to my VPN so I could get access.

00:14:07 - Anthony Campolo

Edgio had a VPN, too. Yeah.

00:14:09 - Nick Taylor

Yeah, and that works. But there are problems with that, though. This is what we would term the old security model, and this is called perimeter-based security.

So essentially when you log in with a VPN, you're getting access to that network. But once you're in, it's kind of like, I don't know, it's kind of like if you made it over the wall or you got into the bar.

00:14:36 - Anthony Campolo

Guy in the city?

00:14:38 - Nick Taylor

Yeah, you're in now. Like, okay, I'm in the bar, or I made it over the wall. And then everybody's like, "Oh, yeah. Nick. Yeah, he must be legit. He's in here."

So it's really just kind of like a one-time check, and then you're considered golden. And that can be dangerous because once you're in and they assume you're trusted, you can potentially do nefarious things because you're not being constantly verified. There's a couple other problems with this too.

So like, if you're using a VPN, I'll use the example of Autodesk because I literally worked there, but there's like 60,000 employees. That means if, say, 60,000 of those people work from home, at some point they have to connect with a VPN.

And hey, what's up, [unclear]? Good to see you, my man.

00:15:27 - Anthony Campolo

Good to see you. Thanks for coming.

00:15:29 - Nick Taylor

Yeah. So the thing is, there's actually an IT burden to this as well. Picture an admin of Autodesk. What happens if one of the employees who aren't on the IT staff or aren't devs, they're just like, "My VPN isn't working. What do I do?"

You're all of a sudden supporting: are you on the right version of the VPN client? Or like there's an installation issue on Windows or Mac. There could be a ton of things. And so there's that burden of maintaining, in this case, potentially 60,000 users having issues with a client.

So there still is an IT burden for your company with something like Pomerium or zero trust principles. But the difference is, people have to contact you to get access all the time.

00:16:25 - Nick Taylor

Well, potentially, but you get provisioned roles and access. It's more like you don't have to worry about all these people who potentially have issues with a client trying to get into the network.

What happens is you have Pomerium, which is a few things. It's called an identity-aware proxy that is continually verifying who you are and what you can do. And this is literally the first thing you hit before you even hit your internal network.

So you could be like, "Okay, Nick's on call. He needs to access prod because he's on call and maybe something goes down in prod. He's going to have to look at the DB or hop into the Kubernetes prod cluster and see what's going on." It'll verify that it's me. They'll verify, like, potentially, is Nick on his work computer, because you can check your device as well.

But it could also be like, does Nick have the on-call role right now? Because you only get access to prod if you're on call. And it'll be like, okay, Nick's on call. So at that point I can get into the internal network.

And anytime I do that, literally the next request of me doing that again, it's going to do the same thing. We'll talk about latency and stuff in a second. But basically you're literally guarding your entire internal network.

It's not the same thing as like, I can actually hit the prod, but I can't log in to it. This is literally like, you can't even get to prod, right?

00:18:08 - Anthony Campolo

How similar is this to AWS? IAM Identity and Access Management. This sounds like it's similar in terms of like defining roles and being based on identity.

00:18:20 - Nick Taylor

Yeah. So, to be clear, I haven't really worked with AWS because, like I was saying, I'm more kind of just getting into this space now.

00:18:28 - Anthony Campolo

For anyone just figuring AWS out, IAM is the first thing you have to deal with. If you do anything on the freaking network, it's like, first you gotta create a role. And it's like this whole role. Yeah.

00:18:37 - Nick Taylor

Yeah. I mean, that said, I've actually worked on a custom single sign-on, and I'm aware of claims and your identity and stuff. But in that particular case, IAM, as you said, that's verifying, are you Anthony or are you Nick? And that's one aspect of zero trust, you know.

So it's, like I said, an identity-aware proxy. The first part is when you try to access something, it's going to be like, okay, you're not logged in, so let me redirect you to the identity provider. So that could be classic stuff like Google, GitHub, Azure Active Directory, whatever it is. You know, it could be your own homegrown one. But basically that's the first part.

And I think that's the part you're referring to, which is IAM, because this is really just your identity. And yes, you might get some roles in there, but that's literally only one part of it.

[00:19:36] So zero trust, like with Pomerium, is: okay, you've been identified as so and so now. But we also have policies in place. So once you're logged in it'll go, okay, Nick's email is Nick. And let's see what policies apply to that. So it'll be like, okay, he's not on call. So he doesn't get the on-call role. And so I can't even get into the production cluster because of that.

It's not even a question of like, because in the case of a VPN, you'd already be in the network and I could go to the production cluster potentially, but I wouldn't be able to log in. But I can still actually hit the machine, you know what I mean?

This is literally because it's a reverse proxy happening at that layer seven level. So it's like if you don't get the keys to go in, you're literally not even hitting the network.

[00:20:33] Although from a user standpoint, it might look like, oh, I just can't hit prod. But from a network standpoint, there's a clear barrier. You're hitting an initial roadblock, which is the reverse proxy and the identity-aware proxy.

Hey, what's up, fuzzy? Good to see you. And again, I'm still learning about a lot of this, but that's the gist.

00:20:55 - Anthony Campolo

Kind of makes sense. Yeah, that definitely makes sense. I think when we kind of see you spin something up, it'll help make it a little more concrete, but yeah, that seems like a perfectly logical thing because, you know, everyone is so concerned about security, or some people are less concerned about security, and those are the people who get owned.

So, you know, it's just such a massive, massive space of technology and strategies, and a lot of it is based on decades of kind of different stuff. So, "two legends"? We're legends? Yeah, we are.

00:21:33 - Nick Taylor

Thanks, fuzzy. I still gotta make my way to Scotland. For context, that's where fuzzy is.

And like I was saying, Pomerium or zero trust stuff is typically at, well, not typically, I think it's always at layer seven. And the reason for that is like you can actually dictate what those incoming requests are going to do.

And a VPN, for example, using perimeter-based security is at a lower layer. So it's at, I'm not positive, either layer three or layer four. But basically at that point, you can't do anything in terms of managing policies or context.

00:22:18 - Anthony Campolo

Like I'm saying here, policy-based access control is the term used to describe it. Yeah.

00:22:24 - Nick Taylor

Exactly. So there's a whole policy thing too that takes part in this, and this is the thing I was saying before. Like, okay, I can create a policy where Nick gets the on-call role because he logged in and maybe the identity provider said, "Oh, thumbs up. I've got that." I gotta turn off those reactions.

00:22:46 - Anthony Campolo

Oh my god. Yeah, I think Edge hasn't turned off automatically, which is nice. But okay, I know you said you don't know a lot about it, but IAM does give you like seven different kinds of policy stuff. So you should take a look at it just because you're probably going to get asked about it a lot in comparison to your tool. That's what I would guess.

00:23:05 - Nick Taylor

Yeah, definitely with IAM. Because I have worked, like I said, on maintaining a custom single sign-on project. So you'll get your claims when you log in and stuff and it'll say, like, you know, this is your email. These are roles you have, for sure, there.

But picture something like this getting wired up to PagerDuty. So I go on call in PagerDuty, and maybe that kicks off something that assigns me that role. So when I log in, I'm an on-call person now, and then basically there'll be that policy there that says only people for prod are in our domain or meet certain criteria.

00:23:48 - Anthony Campolo

Like a bouncer at the door with the list.

00:23:50 - Nick Taylor

It literally is. Yeah, it's a good way to think of it. I actually dropped a video short today, basically using an airport as an analogy, if you want to check that out.

But yeah, it's more than just the initial TSA. It's like you're constantly verified everywhere, right? Like you first get in. Say you don't have your boarding pass yet. You're going to go to the ticket booth and say, here's my passport, thanks for my boarding pass. Or you use the machine, whatever.

Then you're going to go through security. They're going to ask you for your boarding pass and ID again. You're going to have bags, potentially, so think of that as literally like your devices that you would use to do things. And they're going to check that, and you're either good or you're not.

You go to the gate, you can't just walk in. They're going to say, can I see your passport and your boarding pass?

So there's this literal constant verification wherever you go in secure areas of an airport. And so I think it's a really good analogy for this kind of stuff.

And you can even go further. I only had 60 seconds at the airport, but imagine like you're a frequent flier and, like, I'm gonna go crash in the VIP lounge. You can't get in there unless you have your VIP card, you know what I mean? And again, you're being verified once again.

So it's kind of good to look at stuff in the physical world because I think people can relate to it a lot better. When you start talking about identity-aware proxy, IDPs, and stuff, you might lose some people initially. Whereas if you say like, it's exactly like if you're at the airport, people might be like, "Oh, okay, I get it."

And getting back to the VPN, think of it like you get to the airport, you go past some initial security screening, they don't even check your bags, and then you can do whatever you want there.

00:25:44 - Anthony Campolo

Hop in the copilot and be like, hey, I'm in the cockpit.

00:25:49 - Nick Taylor

Yeah. You just go up to the flight attendant. Yeah, I'm Nick, I'm the pilot today. They'd be like, okay, sure.

And like, obviously that sounds like a ridiculous thing. And I think that's what helps convey that example. It would be bananas if anybody could just walk into the cockpit of a plane or go into, I don't know, baggage area, like in the back there where they load the bags.

00:26:13 - Anthony Campolo

You know, I remember what it was like before 9/11.

00:26:17 - Nick Taylor

Yeah. No, I don't. Not that we need to get into that, but obviously there's a lot more security in place now.

00:26:24 - Anthony Campolo

But yeah. The airport is a really good metaphor. So my question now is, since we're going to be talking about how that comes into play: when I think of an LLM, it's either a third-party service or a model you're running on your computer. It's not necessarily a big piece of your company's infra. So how does this fit into the tool?

00:26:49 - Nick Taylor

Yeah. There are different tiers of the tool. Like I said, it's an open source product. I'll drop a link to it on GitHub here.

00:27:00 - Anthony Campolo

You can share your screen too if you want.

00:27:02 - Nick Taylor

Yeah, I will.

00:27:03 - Anthony Campolo

That works. I'll bounce around the docs or something.

00:27:05 - Nick Taylor

Yeah, cool. Give me one second here. I'm just on my personal computer because that's where I have some of this set up for today. Let me go ahead and share my screen.

00:27:19 - Anthony Campolo

Was it your idea to build out an LLM example, or did they ask you to do that?

00:27:24 - Nick Taylor

I had the idea, but the CEO, Bobby, had already done something similar with another product I was telling you about before the stream, Open WebUI, so we can look at that too. Let me go here. Okay. All right, let me open this up. Let's go to GitHub.com and then to /pomerium. I didn't mean... okay, there you can see I already improved the docs, but that's not what I meant to show. That was just autocomplete in action. But yeah, it's on GitHub. There are a few projects on here. The desktop client is an Electron app you can use as well, but I just wanted to link to the actual core project. I'll drop this in the chat, and I don't know if you want to drop it in your own chat, because I think I only have access to Twitch.

[00:28:20] Cool. So essentially, Bobby, the CEO, has been in the security space for quite a while. He actually had another company that was acquired. The name is escaping me, but he's a big fan of open source. So Pomerium is open source, but it's open core.

If you've never heard that term, it means the core of the product is open source. You can literally fork this and use it. Our docs have sections on how to use Pomerium open source. You can secure your stuff on your internal network at home if you're a hobbyist, but you can also use this on AWS or Azure, wherever.

And then obviously it's a company; we want to make money. So there is an enterprise tier based off the core, and there are other things we think people would pay for.

[00:29:22] So that's the enterprise.

00:29:24 - Anthony Campolo

Databases like MongoDB, Cassandra, they're all open core.

00:29:29 - Nick Taylor

Yeah, it's a pretty popular model. And the other thing is we have a new thing called Pomerium Zero. I'll show you this after we start configuring some stuff. I'll explain the architecture real quickly, to the best of my knowledge so far.

Basically, you have a control plane. The control plane is where you define all your policies and decide how things work. It's for configuring the policies, and we have a self-managed version, which is the open source version. You just write YAML files and define the policies.

But we also have a newer product; I think it launched in June when we got our $13-plus million Series A funding. Pomerium Zero is a managed control plane: you get access to a UI where you can configure these policies.

[00:30:41] You know, it's just a nicer experience. But the control plane, when it's actually running, is not running in the cloud. It's still wherever you have it installed. So when you update your policies in Pomerium Zero or you add new routes and stuff, that will get cached, and it'll be running literally in your network.

This ties into the latency I mentioned at the beginning. Because Pomerium is literally where your network is, like your internal apps, there's practically no latency because it's on the same network. It's not like if I were to go to some cloud service, log in, and try to access one of my resources. It's going to go, "Okay, there's this cloud version of something like Pomerium," but it's going to have to go to that other server, then come back once it's said, "Okay, you're all good to go."

[00:31:40] And so there's some latency there. Basically, by having Pomerium at layer seven in front of your internal network of apps, there's practically no latency. And all these policies I was just mentioning that you define on the control plane, they just get cached. So, like, okay. Yeah.

00:32:04 - Anthony Campolo

We got some questions from Fuzzy, and then I have a question too. So open source, open core: the 2025 way of saying shareware and payware. That's funny.

00:32:13 - Nick Taylor

Looking forward to your WinRAR license, Fuzzy.

00:32:18 - Anthony Campolo

So he's asking about sidecars. If I remember correctly, a sidecar is related to Envoy and Istio and crap like that I used to hear about on podcasts all the time. I never did use them, though.

00:32:33 - Nick Taylor

So I'm still pretty fresh. Basically, Envoy is what we use as the reverse proxy.

00:32:41 - Anthony Campolo

So I was right.

00:32:42 - Nick Taylor

Yeah, as the reverse proxy. That's a big chunk, open source for the win. I'm not sure if we're using Traefik anywhere, Fuzzy, but again, I'm on day 13.5, so there's some stuff I definitely have to ramp up on.

00:33:03 - Anthony Campolo

We're talking high level here.

00:33:04 - Nick Taylor

Yeah, but essentially that's it.

00:33:07 - Anthony Campolo

Yeah, I have my question, but finish your thought.

00:33:10 - Nick Taylor

I was just going to say Envoy is a big chunk of it, handling all the reverse proxy stuff. We have our customizations, but that's kind of the core of the reverse proxy because it's a pretty battle-tested reverse proxy from what I've understood.

00:33:26 - Anthony Campolo

Yeah, yeah. It's a really popular open source library. The reason I wanted to stop you is because a couple times now you've referred to the seven layers, which would be good to define because that's something I know very little about. I know what you're alluding to, but it's super, super confusing. This is one of the things I used to see videos about when I was learning to code, and there'd be a four-hour tutorial explaining all seven layers of networking. And I'm like, what the crap? There are also different versions with five layers or something. So yeah.

00:34:04 - Nick Taylor

So again, I'm newer to this part of things. But think of layer seven as the top layer. This is when you make a web request; you're at the application level, I believe.

00:34:20 - Anthony Campolo

So the browser.

00:34:21 - Nick Taylor

Yeah. So if I say go to an API, that, from what I've understood, is layer seven. The reason we're at layer seven is because we can make all kinds of decisions before you even hit your internal network. You could say, like, okay, he hit /api again; those policy checks happen. Maybe we're modifying headers. That's in plain English. I can find a French one for you too if you want, Fuzzy. But like I was saying before, I'd have to go through all this. This is stuff I actually have to read up on.
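The layer-seven idea Nick is describing can be made concrete with a toy sketch. This is not Pomerium's code, just an illustration of the kind of decision an application-layer proxy can make: unlike a layer-3 device, which only sees IP addresses, it can inspect the HTTP path and headers before anything reaches the internal network (the path and header names here are invented):

```typescript
// Toy model of a layer-7 (application-layer) access decision.
interface HttpRequest {
  method: string;
  path: string;
  headers: Record<string, string>;
}

function allowRequest(req: HttpRequest): boolean {
  // Public pages pass through untouched.
  if (!req.path.startsWith("/api")) return true;
  // API calls must carry an identity header; a real identity-aware
  // proxy would verify a session and evaluate policy here.
  return "authorization" in req.headers;
}

// An L7 proxy can also rewrite headers before forwarding upstream,
// which is the "maybe we're modifying headers" part.
function withUpstreamHeaders(req: HttpRequest, user: string): HttpRequest {
  return { ...req, headers: { ...req.headers, "x-verified-user": user } };
}
```

A layer-3 device could only ask "is this IP allowed to talk to that IP?"; it has no notion of `/api`, headers, or who the user is.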

00:35:02 - Anthony Campolo

But yeah, you don't need to define all seven layers, but conceptually it's about how close you get to the actual data going through the network, I think.

00:35:15 - Nick Taylor

Yeah. And I'm not even sure if this is actually the right thing here, but think of layer seven as literally what you would see as an app, like the actual URL. It starts there, and something lower, like layer three, like the VPN, is maybe at the IP address level. Again, I've got a gap there, so don't quote me.

00:35:40 - Anthony Campolo

Don't be confused.

00:35:41 - Nick Taylor

Don't confuse that with being web developers.

00:35:42 - Anthony Campolo

We almost exist only at the top layer because all of these things have been abstracted, and they're built on protocols that were invented by the US government in white papers 40 years ago, you know.

00:35:56 - Nick Taylor

Yeah. And I can give a concrete example. When I worked at Netlify, I was on the frameworks team. Say you have a request that comes in, and there's our CDN and stuff. You might have a cache hit or miss, but the initial request doesn't have that information. And then once the request ends up in the internal network, headers are being modified and then you get a response returned. There are manipulations you can do when you're at that level, whereas if you're at the IP level, you can still do some stuff, but at that point you can't handle a request in terms of what they were requesting or what the headers are like, at least I don't think you can. This is a weak point for me right now.

[00:36:47] So this is something I'm definitely going to ramp up on, but that's kind of my understanding.

00:36:53 - Anthony Campolo

Yeah. Cool.

00:36:54 - Nick Taylor

But yeah, good questions though. So it's open core. Literally go take a peek at it. I'm working on the docs right now. If anybody's interested in helping on the docs, I'm happy to meet up with you and chat about it. For me, this is exciting because I've been working in open source since 2020, literally being paid to. It's cool that I'm working again at another company that has open source on the brain, at least for me. I find that cool.

00:37:29 - Anthony Campolo

No, it's great. Super great.

00:37:32 - Nick Taylor

We kind of got derailed a bit, not in a bad way, but that whole tangent brings us back to what you were asking about, like LLMs and securing them, right?

00:37:44 - Anthony Campolo

Yeah.

00:37:44 - Nick Taylor

What's the deal here? The cool thing is I can use it as a hobbyist or a small company and still really benefit from it, or it could be a large enterprise. In my case, I created a GitHub Copilot extension and I had this template. I did a talk with the GitHub Open Source Hour in December, but after talking to you and my friend, an old coworker, John, who talks about this all the time, I was like, I got Ollama up and running on my Mac. And then I was curious: could I create a GitHub Copilot extension that talks to Ollama? So that's what I did recently.

I can go find that on Dev. I got it. Okay, cool. So I created this blog post about it, and there's a project on my GitHub, and I got it working.

[00:38:49] The thing is, we'll go through how to set up a GitHub Copilot extension. Prior to me creating my template, I had done this live stream with a couple people from GitHub when I was working in open source, and I made an open source Copilot extension. I now know how you build one, obviously, but it's a few things. There's an actual web app that's running; that's the core of a GitHub app in general.

To build a Copilot extension, at least the universal kind where you can use it on GitHub.com, Visual Studio, Visual Studio Code, etc., you have to create a GitHub app. So you have to have a deployable site for that to work. I basically use port forwarding, and we'll go show this shortly. You have this app running in GitHub as a GitHub app, and then when you make requests in Copilot, you're going to end up hitting that actual GitHub app.

[00:39:56] And that's kind of how it all works. You make requests, like you ask a question, it's going to go through the web app you made which is installed in GitHub as the GitHub app, and then you'll get answers that way. That's great.

Working locally, I use the port forwarding in VS Code so I can expose my local machine to the internet. Then I was like, okay, I've got Ollama running. The reason it worked in my setup was that my local web app, which is the GitHub app, was exposed to the internet through that port forwarding, and since it was running locally, it could still hit Ollama directly. And it works. I'll show you that in a minute.

So my next thought was, and this was after interviewing with him, I could probably secure my own Ollama so that it can be accessible from outside of my internal network, which is my internet connection with my router.

[00:41:03] Then I would be able to use Ollama in a Copilot extension wherever I am on the planet, and that was kind of the idea, and I haven't done that yet. So that's kind of what we were going to look at today. The other thing too is we could also look at securing something called Open WebUI, which is a pretty nice front end that allows you to use local models, and we could do the same thing.

So I guess let's go through the Copilot extension first, and then I can show you the Pomerium stuff, and then we can see if we can secure... I'm pretty confident we can secure the Ollama API endpoint. The thing I'm wondering about is I'm not sure about the authentication once it's running in the Copilot extension, but we'll see. So, with all that said, let's get busy. All right. Yeah.

00:41:57 - Anthony Campolo

I'm not super deep on Ollama because I tried out a whole bunch of different things for running local models for AutoShow. Ollama uses llama.cpp under the hood, so I tried that out, and I tried a Node version of that, node-llama-cpp. Then I found that Ollama really had a nice level of abstraction where it simplified a lot of stuff while still giving you pretty much all the power of those tools.

00:42:25 - Nick Taylor

Yeah, cool. So I think what I'll do is I'll start running the Copilot extension and I'll explain how it works, and then I'll show you where Ollama comes in and stuff. So the first thing you want to do is you need to get the app running. It's been a minute here. Okay. I think I switched Node. Okay, that's all up to date.

All right, I'm using Node.js for this. Hono is pretty solid, so I decided to go with that. But the web app you build for a Copilot extension can literally be in any language or framework as long as it can serve requests. That's all that matters.

That said, they have a preview SDK that makes it easier to do stuff in Copilot extensions. It's currently written in JavaScript, so it only works in JavaScript and TypeScript.

[00:43:24] But this stuff is open source. You could fork it and write the same thing if you wanted. Or maybe GitHub will get to that at some point. But that's why I have it written in TypeScript, so I can leverage those things.

We'll go through all that. But the first thing to do is get your app running. I'm just going to do npm run dev. This is going to run a few things here. So Ollama is running. If we go... oh, opening up in the wrong browser. Hold on a sec. Let's go to an actual browser on my screen. This is just Ollama installed on my computer. This has nothing to do with the Copilot extension; that's just to show that the endpoint is running. I can't zoom in on this, but I have a little Ollama guy in my menu bar up top there. Okay. And also there's a web app running on localhost 3000.

[00:44:20] So let's come over here and you can see, "Welcome to the Ollama-powered Copilot extension." So the app is running, but we're not in Copilot mode yet. The first thing I need to do is... you don't have to use VS Code for this; I'm just in VS Code, so I'm going to leverage it. They have a whole ports section in one of their panels, so you can do port forwarding.

00:44:46 - Anthony Campolo

And I did not really know this. I kind of vaguely knew that this was possible. I've never done this before.

00:44:54 - Nick Taylor

Yeah. So there's other options. I mentioned in my blog post you can use Cloudflare. You could use Ngrok.

00:45:00 - Anthony Campolo

Like, yeah. Ngrok. I'm familiar with it. I use that. Yeah.

00:45:04 - Nick Taylor

But at the end of the day, it's literally the same thing. You're just exposing a locally running app on the internet, so however you want to do it.

00:45:13 - Anthony Campolo

On the Linux server.

00:45:15 - Nick Taylor

So I'm in VS Code, and I'm just going to leverage that. They have it out of the box. The only thing to note is you can see here the visibility of it is private at the moment. You have to switch it to public. I mentioned all this in the documentation. Yeah, definitely, principle of least privilege for the win.

Visibility here. So I'm going to set it public. I can't zoom in on the menu, unfortunately. So it's public now. Now if I go to this port... sorry, if I go to this website, you're going to see... I can't zoom in on the browser bar, but it's literally that URL from Microsoft. That's up here. At this point, and actually I'll drop it in the chat if anybody wants to go hit that, if you want to go DDoS my computer, that's running.

That's my JS GitHub web app running right now that you can access, and this is required.

[00:46:15] So we're going to go to GitHub now. Let's go to your developer settings. Where is it? Let me zoom out for a sec here. I always zoom in for live streams, but there we go. Okay.

You need to create a GitHub app. I already have one configured, and the instructions to do this are in the open source project and in my blog post, so I'm not going to create a new one. I'm just going to show you what I had set for it.

Basically, security again here. Let's go to security. All right. So there's some basic information here. You give it the name, whatever you want to call it. The home page URL here doesn't have to be this because this isn't live yet, but if you had a product home page you could put that there. Essentially right now I'm just using this URL, which just takes me to the "Welcome to the Ollama-powered Copilot extension" page.

[00:47:25] This here is not required for a GitHub app. That is for the Copilot extension. But when I was talking to the GitHub people on my live stream back in early December, they said you still have to fill it out right now, even though it's not used. This is something in the future, like UX-wise, we're just going to not make it required if you're doing a GitHub app for a Copilot extension.

The other things you don't need: if it's checked off, uncheck active for the webhook, and then you save your changes. The other thing to note is your permissions. We keep talking about security and stuff, so you do want to have principle of least privilege here. Only give what you absolutely need. I don't need to access repositories or org permissions. I only need to set account permissions. Let's zoom this in here.

[00:48:24] Maybe that's a little too zoomy. There's only two things I need potentially. One you absolutely need is read-only access to Copilot chat for your extension to actually work.

The other one is newer, and I use this in my extension. This is having the Copilot editor context. What this gives you is access to, like, if I drag a file into Copilot chat, I can actually access that file in my own extension. Prior to a few months ago, I don't think you could do that.

So these are literally the only two permissions I have set, and then you just save that. Let's go back to the top here, and then we'll go to the Copilot settings. Initially, when I did my first stream with the GitHub people, there weren't three options here; there was only disabled or agent. They have a new one called Skillset. I haven't used it yet, but for our purposes right now we just set it to agent.

00:49:30 - Anthony Campolo

So confusing. They all sound like... I have no idea what the relationship between those three names is, just based on hearing them.

00:49:39 - Nick Taylor

Cool, cool. And then you have a URL here, and this is the URL that I've exposed with the port forwarding. It's just the root URL, so just /. This is going to be the endpoint when you ask a question in Copilot chat for your extension, which I'll show you in a second. It's going to post to this endpoint, and that's why that's there.

You can also add some inference description. This just says what it is and what it does. Initially I didn't have anything in here, but you can put something in there. That's pretty much all you need to do in terms of configuring a GitHub app for a Copilot extension.

After that, I've already done this, but you would install the app and then it's installed. The only other thing I've done right now is you can expose a GitHub app or a Copilot extension to everybody. For the time being, I've just made it accessible only to my account because I was just testing out stuff.

[00:50:44] But if you did actually say, like, okay, I'm ready to ship this, you just allow it to be accessible to anybody. Then there would be a public page for it where people could just go and install it.

00:50:54 - Anthony Campolo

Are they all hitting your machine then every time they use it?

00:50:58 - Nick Taylor

Well, if I were to actually deploy this, I would deploy it in some cloud service probably.

00:51:04 - Anthony Campolo

Okay, I see. Yeah.

00:51:05 - Nick Taylor

But that gets back to my point. It's only for me right now because I know it's running on my local.

00:51:12 - Anthony Campolo

So yeah.

00:51:13 - Nick Taylor

So we have everything running there now, which is great. Now let's go to... I've got to re-enable Copilot Chat because I had to disable it for work. Let's... I just gotta zoom out for a second here. Copilot Chat is here.

00:51:30 - Anthony Campolo

Your screen is massive.

00:51:34 - Nick Taylor

Yeah, I got used to it. I used to have a 27-inch monitor, and that's all I would use. I got used to switching screens and stuff. But doing web development, when you're building stuff in UI, I don't know, I just tried it and I kind of got used to it.

All right. So I'm just going to go ahead and enable Copilot again. Okay. Now let's zoom in again. All right. I think I have to restart the extension. Is that it? Should be okay now. Copilot. Okay. Yeah, there it is. I'm not sure why my shortcut isn't working, but anyways, we've got the chat here now.

So you were asking before about Copilot. You can choose the models. Again, I can't zoom in the little menu here, but you have access to Claude 3.5 Sonnet, GPT-4o, o1 preview, and o1-mini.

[00:52:39] And so let's just ask Copilot something like, "What is a B-tree?" So we're not using my Copilot extension yet here. So Copilot's thinking, what can I do? And it gives you an answer. Okay, great. Awesome. I didn't specify a language. This looks like it's in Python, I think.

00:53:01 - Anthony Campolo

Yeah. It usually writes Python by default unless you tell it otherwise. Data scientists ruined it. Yeah.

00:53:09 - Nick Taylor

Cool. So, all right, we got something running there. Okay, great. I'm just wondering why the chat isn't... There we go. There's the side chat.

If you're using a Copilot extension, you have to do this. If you do @, you're going to see all these Copilot extensions I have access to. I installed a few to test out stuff originally, like this Blackbeard agent, which GitHub lets you install to try out. It basically answers like a pirate.

00:53:40 - Anthony Campolo

I was going to say, that's what I thought it was going to be.

00:53:42 - Nick Taylor

Yeah. The one thing to know about using a Copilot extension is you can't just mention it mid-message, like, "hey, @Ollama Pilot or whatever," and ask a question. It's not going to use your Copilot extension. You always have to put the @ at the beginning. So we have Ollama Copilot, and I'm going to say, show me a Merkle tree in, I don't know, Go. So I'm going to go.

00:54:21 - Anthony Campolo

You built your own Copilot extension. Were there any that you were already using?

00:54:27 - Nick Taylor

Extensions? Not really, because I didn't know it was a thing until I actually built one. Yeah.

00:54:33 - Anthony Campolo

I didn't know it was a thing until you started talking about it.

00:54:36 - Nick Taylor

Yeah. I honestly haven't seen a lot of content around it yet. I feel like I got in early with some content on this.

But I'm going to go ahead and, just for one second here, we're going to ask a question. So watch it, because it's a demo; it's going to just conk out. But this should technically be hitting Ollama on my computer right now, and you're going to see. It's explaining.

It didn't answer my question. I don't know why, but anyways, this is actually running now, so I think it's because I included the code snippet, maybe the file.

But I do want to show you. I'm going to stop the web server for a sec, and I'm going to switch to debug mode. So I'm going to use the JavaScript debug terminal, and I'm going to start it up again. And now we have the debugger running. And you can see here that this is the code running.

[00:55:44] Now it started up the app. So I had a breakpoint here when my, oh sorry, when my Copilot extension starts. I'm just going to run F5 and it's up and running.

And now I'm going to show you when I actually ask a question. So let's come up here. And this will explain the pieces of what happens in a Copilot extension. So let's do a new conversation. And I'm going to get rid of this file, or we could use that file. Let's open up another file. Maybe let's do utils. Okay. And I'm just going to say, can you improve this code?

So just to reiterate, I'm running a web app on my local machine, which is registered in a GitHub app, which is going to handle requests from the Copilot chat that I'm sending here right now. So I'm going to press enter. And all of a sudden I'm in the post here. And this is actually my Copilot extension running now. So I'll just run through a few things here. It's going to time out because I've paused it.

[00:56:53] But basically, I'll let this run. So there's a few things we can do. And I'm leveraging the Copilot SDK for extensions right now. I can verify and parse the request. These are just security checks in place. Like, is this really a valid payload? GitHub has headers and stuff that they inject. So I do a few checks. And then, for example, if there's no token for me, then I can throw an error.

But otherwise I'm just going to let it run. And then you're going to see here if I just do F10, I asked a prompt and you can see here I have the whole payload. Maybe it's better if I open it up in here. So if we do payload and then all of a sudden you can see this is everything that's coming from the Copilot UI. Well, not the UI, but from Copilot. And you can see the messages in here, and then you can see, like, my message here.

[00:57:56] Can you improve this code? You can also see references, which are like the files and stuff. So I have access to that utils file, etc. So we don't need to dwell on it too much, but I use that file as a reference.

And then I make an actual request to Ollama here. So I'm passing in to Ollama. So there's my base URL for Ollama, and this is the endpoint you hit in Ollama, /api/generate. And I'm actually choosing a model. I have a default one which should be set to, I think, Code Llama. And then I'm passing in my prompt and you can see I augmented the prompt here. There's my initial question, like, can you improve the code. And then I do some prompt engineering to say, like, if there's a file reference and this only shows up if I actually reference the file, I just say what you need to do with it.
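The step Nick walks through here can be sketched as a pure function: take the messages from the Copilot chat payload, pull out the latest user question, fold in a referenced file if there is one, and build the JSON body for Ollama's /api/generate endpoint. This is a simplified reconstruction, not his actual extension code; the trimmed-down types and the prompt wording are illustrative:

```typescript
// Simplified reconstruction: turn a Copilot chat payload into a request
// body for Ollama's /api/generate endpoint.
interface ChatMessage { role: "user" | "assistant" | "system"; content: string; }
interface FileReference { name: string; content: string; }

function buildOllamaRequest(
  messages: ChatMessage[],
  reference?: FileReference,
  model = "codellama",  // default model, as in the demo
) {
  // The last user message is the question typed into Copilot chat.
  const question = messages.filter((m) => m.role === "user").at(-1)?.content ?? "";
  // Light prompt engineering: append the referenced file, if any.
  const prompt = reference
    ? `${question}\n\nHere is ${reference.name}:\n${reference.content}`
    : question;
  return { model, prompt, stream: false }; // stream: false = one full response
}
```

The extension would then POST this body to http://localhost:11434/api/generate and relay the model's answer back to Copilot.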

[00:58:56] And then I just let it run. So just to reiterate, we're in Copilot. We asked a question. It hit my Copilot extension. And then in my Copilot extension it asked Ollama something. It timed out with a server error because I was debugging, but I'm going to run it again and let's just let it run. And it should answer.

Okay. It's a demo. So I probably, you know, whatever. But essentially I've been able to connect to it. Let me maybe just break the debugger. Let's try that again. Let's use utils. So let's hide this one and let's add a new file and we'll say utils. Can you improve the code in this file? This is a very terrible question to ask because it's too generic, but just for the sake of demo.

Okay. Haha. Okay, it's answering, but it's not what I want. But the main point is.

01:00:07 - Anthony Campolo

You get the idea. Text is coming back.

01:00:10 - Nick Taylor

Yeah, exactly. So we have this Ollama endpoint right now, which is over here, which was localhost 11434.

01:00:21 - Anthony Campolo

There's probably better things than Code Llama at this point because I think that's based on Llama 2, which wasn't really that great of a model compared to what we're used to now.

01:00:32 - Nick Taylor

Yeah, I have others, like if I do ollama list here, I have these ones installed right now. So, you know, including DeepSeek, but there could be a better one.

01:00:47 - Anthony Campolo

That would probably give you the best output, though. Yeah.

01:00:51 - Nick Taylor

So, like, I don't have this in my extension right now, but I could make it, like, pick the model you want to use and stuff. But anyway, this does work, even though it's still a work in progress.

The main thing I wanted to do is, how can I secure this? So let's talk Pomerium now. So if I come back here, I have Ollama running on here and I'm just going to open up another tab. I'm going to go to Comm. I'm using what I mentioned before, which is called Pomerium Zero. And essentially that is the managed control plane. So I have this UI where I can create policies and routes. Once I save them, it will update where the access plane is running, like where the identity-aware proxy is actually running, which is on my local right now in front of these things.

[01:01:50] So there are a few things. We have this verify endpoint, so this ties in.

01:01:55 - Anthony Campolo

A bit more.

01:01:56 - Nick Taylor

Yeah. And I don't think I'm running it. Oh, I might be on the wrong one too. Let's see here. Home cluster. That's right. Okay, let me just make sure.

01:02:08 - Anthony Campolo

To zoom in on.

01:02:10 - Nick Taylor

Yeah. So I think there are some responsive issues on the app, but let me zoom in here. Okay.

01:02:22 - Anthony Campolo

That's good. Yeah. Cool.

01:02:24 - Nick Taylor

Yeah, that's a little too big. Okay, so we have this. Oh, this is one endpoint I made, but we're going to create a route. But I want to come over to here for a sec. Where was it? I just want to check in Docker.

Okay, I'm running Open WebUI, which is that other project I told you about. But I have this compose YAML file. It's going to show my secrets in there right now, so I'm not going to show it. But basically it's a Docker Compose file that allows you to run it. So I'm going to do docker compose up. I just have an alias, due, for that. So now I create.

01:03:02 - Anthony Campolo

I created an account actually before we did the stream, just so I could see what it was. And it starts off by giving you different options like Linux or Docker Compose to install it. And so I just did the Docker Compose. That's what I knew how to do. So that's how I spun up the initial thing as well.

01:03:18 - Nick Taylor

Yeah, exactly. So I'm logged in right now, and there's an identity provider available to you through Pomerium if you don't want to use something else or you're just getting started. So just for demo purposes, and even for my own usage right now, that's what I'm using. So I'm already logged in and you can see, like, I'm verified and there's some claims here. Let's close that. And we want to create a new route. So let's just create a custom route and I'll just call it Ollama API. And let me just.

01:03:53 - Anthony Campolo

So I saw one of the options. There was node route. Why wouldn't you do that since you're using Hono?

01:04:00 - Nick Taylor

Because right now I'm not securing the GitHub Copilot web app. I'm actually securing Ollama. Okay, that makes sense. Gotcha.

Let me just disable 1Password for a bit, just so it's not popping up like my family stuff here. It keeps popping up names. All right, I'll just turn it off temporarily. All right. Cool. All right, let's get back to it.

So whatever. You can put a description. It's not necessary. You don't need to put a logo either. The important thing is, let's create, like, what do we want to hit for Ollama. So we'll just call it Ollama. And it's going to autocomplete with the rest of the URL, which is this URL that I have access to. It's just an auto-generated one. I don't have to worry about creating certificates or anything. I just have a wildcard certificate for this massive giraffe subdomain of Pomerium. So that's what I'm going to call the endpoint.

[01:05:04] And then this is what you actually want to go to. So we want to hit HTTP localhost. And then what is it, 11434 or 34343, I think.

01:05:18 - Anthony Campolo

Don't quote me on that though.

01:05:20 - Nick Taylor

Yeah, I'll just go check real quick. 11434.

01:05:24 - Anthony Campolo

Yeah. That's wrong.

01:05:26 - Nick Taylor

Okay. Cool. And I'm going to go ahead and save the route. So now we have this route. And this is literally, and to give a bit more context, I'm on my home network. I had to open up a port on my router. So I have port 443 open. And this is how I was able to run on my local network. If you were to deploy this in some cloud-based area, you probably wouldn't have to worry about that so much, or you would do it at that level, not at a router level like I'm doing.

So I have this.

01:06:03 - Anthony Campolo

Do you mean a router like your home internet router or router?

01:06:08 - Nick Taylor

No, sorry, the home router because right now I'm running this all locally and I want to expose my local stuff onto the internet. So I've set port forwarding for 443 on my router. And that's how I'm able to run this.

Interesting.

01:06:25 - Anthony Campolo

I wouldn't even be able to do that because where I live, we're in like a mesh network. We all have access, like one thing. So I don't even have access to my router.

01:06:35 - Nick Taylor

Gotcha, gotcha. Okay, so I've actually secured my Ollama endpoint now because if I go to it, this is a valid URL, but it's telling me it's forbidden. That's a good thing. So let's go ahead and let's go to policies and let's just create a new one.

01:06:54 - Anthony Campolo

This does seem very, very simple for a security tool. This is a pretty clean dashboard and onboarding flow. I gotta say.

01:07:03 - Nick Taylor

Thanks. Thanks. Yeah, I think there's still some improvements we can do, but that's great feedback. So let's just call this Ollama access. Like, you can literally call this whatever you want.

I'm going to make this optional because this is optional in the sense that I can apply it to certain routes. If I put in force, it'll end up being like a global policy. And that's not what we want. And I'm going to add an allow block. And I'm just going to say, and this is just like classic and-or stuff. And I'm going to say domain is, and I'm just going to put nick.ca, which is my email domain in my claims that you saw in my login access there.

So I'm going to save that policy. And now I can use this and you can see it says it's unassigned. So let's go to a route. Let's go back here to the Ollama one. And let's just go ahead and edit it. And I'm going to add a policy here now.

[01:07:58] And I'm just going to say access and I'm going to save the route.

Now if I go here, let's just double check again. So I'm logged in as nick.ca here right now. That's good. And let's come here. Okay, so it says the web server is down. So that's not the same error. So that means it's actually trying to hit my Ollama now and I have access to it, I'm pretty sure.

So it says the web server is not returning a connection. And it even tells you, you kind of see stuff like this. If you've ever used Cloudflare, you'll see it's like, this is working, this is working, upstream host is not. So there's something with Ollama. So let's make sure that I actually have it set correctly. Okay. Oh, local. I'm going to literally copy-paste the route URL.

01:08:59 - Anthony Campolo

Just sometimes you have to do like the 127.0.0.1 instead of localhost.

01:09:04 - Nick Taylor

Oh yeah. Yeah. Oh, I know exactly what it is. I'm not using the IP. I need the IP because remember this is a networking thing at this point. So it needs to know where to go on my home network. Like, localhost doesn't make sense when it comes in.

So I'm going to get my IP address, just give me one sec here. Let's go back to the routes. I have another route that already exists, so I'm just going to copy the IP address out of there. That's what it should be. And I ran into this before. This is something I constantly keep doing. I don't know what that means, but okay. So let's put this here and let's save that. Okay. And now let's go to it.

Okay. It could be cached or something maybe. Let's see here. Let me make sure that actually is working, that actual IP address. Okay. So that's me.

[01:10:13] It's not right. Why would it not? I wonder if... Oh, Ollama has to... Can you run Ollama with an IP address? Let's see here. Oh, OLLAMA_HOST environment variable. Yeah. It binds to 127.0.0.1 by default. So like, how do I... Okay, so at this point, this is not an issue. This is me.

01:10:43 - Anthony Campolo

OLLAMA_HOST environment variable.

01:10:45 - Nick Taylor

Okay. Yeah. So 0.0.0.0 will allow us to access it via the IP address too. So okay, let's do that. And then I guess.
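By default Ollama binds only to the loopback interface, so a proxy forwarding to the machine's LAN IP can't reach it. The fix applied here is the OLLAMA_HOST environment variable. A small sketch of launching the server with it set (the subprocess call is commented out since it needs Ollama installed):

```python
import os

def ollama_serve_env(bind: str = "0.0.0.0") -> dict:
    """Environment for `ollama serve` that listens on every interface,
    not just 127.0.0.1, so a reverse proxy can reach it by LAN IP."""
    env = dict(os.environ)
    env["OLLAMA_HOST"] = bind
    return env

# import subprocess
# subprocess.run(["ollama", "serve"], env=ollama_serve_env())
```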

01:10:57 - Anthony Campolo

I went through a decent amount of this stuff because I had to get it working in Docker and then speak to my Node app and Whisper. So I had this whole crazy multi-Docker setup. Okay. And this is a lot of the stuff that I was hitting along the way.

01:11:15 - Nick Taylor

Okay. So if I refresh this now, okay, Ollama is running. So that's great. So via the IP address is working now. So if we come here, this should work. Okay.

So this already is pretty cool because, one, we haven't done too much to do this. I have Ollama running on my localhost. We just added that environment variable to just say, like, allow me to use Ollama with my IP address. And then if we go back into Pomerium, all we had to really do is add a route. So I chose Ollama and I basically put in my IP address with the port, and that allows me to securely access Ollama now from anywhere.

01:12:01 - Anthony Campolo

So if someone typed in that same, like if they knew your IP address, is this link live? Can anyone go to it, or is it actually kind of verifying you that way?

01:12:14 - Nick Taylor

Well, the thing is, like, my IP address that's there right now, that's not a publicly accessible internet address. That's like in my internal network here. So, like, even if you went to that, maybe your local has a similar IP address, but it wouldn't work.

So, and that's where this gets back to, at least from my understanding so far working with the product, is Pomerium is running on my localhost. This gets back to what I was saying about the latency and how it's right in front of your internal network. Obviously it's my home computer right now, but it's the same idea. So Pomerium is literally part of it is the reverse proxy where it goes.

01:12:55 - Anthony Campolo

That's the reverse proxy part. Okay. Yeah. This makes sense now.

01:12:58 - Nick Taylor

Yeah, yeah. So you can go to that URL. I'll actually drop it.

01:13:05 - Anthony Campolo

In the private chat so I can copy paste it.

01:13:08 - Nick Taylor

Yeah. Cool. I mean if anybody sees it on screen anyways they can paste it, but I'm not too stressed out about it. So go ahead to that site and you should get a 403. Yeah.

01:13:23 - Anthony Campolo

Okay.

01:13:24 - Nick Taylor

Actually, go ahead and share your screen for a second. This is great.

01:13:35 - Anthony Campolo

So first it gave me a login screen there. Yeah.

01:13:39 - Nick Taylor

Let's do that again. I'll just get you to log out again real quick and then we'll, yeah.

[01:13:43] Let's do this.

01:13:47 - Nick Taylor

Okay. And then just go hit the URL again. Okay. So you're trying to access, just to make it crystal clear, my Ollama endpoint that I've exposed to the internet. You get gated by a login because it's like, who the heck is this?

And this gets back to the whole thing about thinking about the VPN. The VPN is like, once you're in the network, you're good. But actually, no, wait, forget what I was saying there. At this point, it's secured on the internet. Nobody can access this unless you're on my email domain. So if you go ahead and log in.

01:14:28 - Anthony Campolo

Pomerium is kind of doing the identity layer, basically.

01:14:32 - Nick Taylor

Yeah, yeah. And it doesn't have to, like I said, you can use GitHub, Google, your own Okta, whatever. Okay. So this is great. Can you paste in the chat what your email is that you used to sign up? Because I'm going to change the policy. And are you cool with me showing it on the screen? Yeah.

01:14:49 - Anthony Campolo

Yeah. That's fine. Yeah. It's deb@ajcweb.dev, @ajcwebdev.

01:14:56 - Nick Taylor

Okay, here I'll present. Share screen. Okay. So let's go back to, I'm in Pomerium and, like I said, if you're just using the open core, you can literally do what I'm doing here. It's just you would be doing this in a YAML file, right? And if you're using the enterprise version, there's like a souped-up version of this with lots of other goodies. But for this, it's just easier to show with Zero. And it's pretty nice that I got set up pretty quickly.

Okay, so we've got a policy already. So I'm going to go to my Ollama access policy. I'm going to go ahead and edit it. And you can see here. So domain is nick.ca. And I'm going to add, not an and, let's do an or. We could just say email. I'm not even going to do domain. I'm going to be very specific: anybody in my domain or you. Let's go ahead and save that.

[01:16:03] Now if you go back to the endpoint and refresh it, you'll probably still be denied. Oh okay. There you go. Okay, cool. So now, just to show this, I'm going to come to the route again real quick, and I'm going to edit it and I'm going to remove that policy. I'm going to save it. Try accessing it again.
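The allow logic just configured (anyone whose email domain matches, or one specifically listed email, OR'd together) can be sketched as a predicate over the identity claims. The rule shape below is a simplification for illustration, not Pomerium's actual policy engine:

```python
def allowed(claims: dict, rules: list) -> bool:
    """OR across rules: a rule matches on the email's domain or the exact email.

    Example rules: [{"domain": "nick.ca"}, {"email": "deb@ajcweb.dev"}]
    """
    email = claims.get("email", "")
    domain = email.rsplit("@", 1)[-1] if "@" in email else ""
    for rule in rules:
        if "email" in rule and rule["email"] == email:
            return True
        if "domain" in rule and rule["domain"] == domain:
            return True
    return False
```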

01:16:29 - Nick Taylor

Yep. Boom. Yeah. All right. So this is pretty cool, I think. So let's put that policy back on and I'll just go ahead and save it. And so what happened there is I updated the policy to add you to it.

That policy is being managed in this managed control plane, which is, think of it as the UI for now. But when I updated that policy, what it did was it invalidated the cache of that policy. It just invalidated it and then it updated that cached policy, and then that's what's running immediately after that.

And this gets back to, like I was saying, Pomerium is running in front of your literal internal network. Like I said, in my case, my home network. But the cached policy is there in place. So the latency is very minimal because you're in the same network. I'm sure there is some latency to doing this check, but it's really minimal.

[01:17:36] We're talking like milliseconds, I think. So anyways, this is just, I think, really cool. You can just do this so quickly. And a cool thing too about this is you can literally start securing anything. Say you're a small company and you're just getting started. You can already improve your security posture by just putting this in place.

Even if your app doesn't have single sign-on or anything yet, it's just maybe email login and there's nothing fancy about it, you can put this in front of it. And then you could either use our identity provider or you could bring in GitHub or whatever. But basically, all of a sudden you can start improving your security posture just by doing that.

And then you can say, like, okay, everybody has access to this stuff still, but now you're blocked right away unless you go through this first. So it's just kind of a nice way to automatically improve security even if you're not leveraging everything in it right away.

[01:18:41] So.

01:18:41 - Anthony Campolo

So could you show how to actually hit the endpoint and not just show the "Ollama is running" page? Because I tried to do the API chat endpoint in my client, Yaak. It's like Insomnia. And yeah, I'm not logged in through my email, so when I hit it there, same thing.

01:19:02 - Nick Taylor

Yeah. So this is the part about the copilot extension where I'm not sure how it'll play out with, like, this is secured right now. It might take a little longer than just the stream today. But let's see here. So the first thing I want to do is I'm going to come back to the copilot extension, and I think it's right at the beginning. Oh, no, I have a config. Okay.

So basically I have this base URL. I'm not going to pass in an environment variable right now. I'm just going to go and hit here, and I'm not sure what's going to happen. I think I'm just going to get a 403. There's some part of the flow where I would have to handle the login, you know what I mean?

01:19:53 - Anthony Campolo

So yeah, that's what I was thinking exactly. Yeah. Yeah.

01:19:56 - Nick Taylor

And I'm pretty sure you could probably do that in Copilot extensions, because I did one for OpenSauced. It asked you to log in via whatchamacallit, Supabase with GitHub.

So let's just try this. If it doesn't work, it's all good. The main point of this right now, at least, is just to show that I've secured an internal resource. But now we're in uncharted waters, which is totally cool. That's okay. I gotta restart it. So actually, let's get out of. Yeah, we'll let it run there. Sure. Cool. And let's just let it run.

01:20:44 - Anthony Campolo

Yeah. I'm looking at the LLM guide that you had originally shared with me, and they gave you a whole Docker Compose file where you kind of include all your authentication stuff.

01:20:56 - Nick Taylor

Yeah. So all that stuff that's there, I'm running that stuff in Docker right now in Zero, and I configured the routes in there. So it's pretty much doing what's in there. I don't know if you want to drop a link to that or not, but yeah.

I think what's going to happen here is I'm going to get a 403, and then we could probably just say, like, let's do a new chat. We'll say, will this, or I'll ask a real question, how do you do a bubble sort. All right. Okay. Now let's see here. Okay. So it's still going to come into the copilot extension. Where it's probably going to die is when it tries to make the request to Ollama.

So let's go down there. Where is it? Ollama, where are you at? Where's Ollama? Okay. Oh, yeah. Here it is. There, there, there.

01:22:05 - Nick Taylor

Let's just go forward slash API. Yeah, yeah, there it is.

01:22:08 - Nick Taylor

Yeah. I'm being inefficient. All right, I'm going to go ahead. It's going to ask the question. It's hitting the endpoint as it should. This is my GitHub web app running right now. And then we're going to get to the Ollama. And this I'm pretty sure is going to fail. So it's probably going to end up in here.

Okay. Error parsing Ollama response. Sure. Why didn't it fall in there? Oh, yeah. No, that's this actual error. So yeah, why didn't it hit there? Let me do it again. But it's clearly failing because it can't hit that. It should give a 403. But why is it, where are we at? I think it's just going to die. It's going to try. I would have expected it to fail faster.

But I think what we can do is what I did with OpenSauced, which was like, if the response fails and, oh, you know what it is, I have to put it in a try-catch probably.

[01:23:22] Oh no, it's in a try-catch already. Oh, it's over here. It should land. Sorry. Okay, let's try this again. Okay. So this is expected. We get an error. And the error is, what is the cause? Okay. Yeah. Unexpected token. So the JSON parse is choking on a web page, which is the 403. Like, if we look at that, it's.

01:23:51 - Anthony Campolo

Yeah, that's.

01:23:51 - Nick Taylor

What I got.

01:23:52 - Anthony Campolo

In the app. Yeah.

01:23:54 - Nick Taylor

Yeah. So what we could do is we could try this. We know that it failed. I'm just going to assume it's always the same error right now. And then we could say, I'm going to just disconnect here for a second so you can see this error here. That's a native Copilot UI error. And that's created by this helper function that's in the SDK.

So what I'm going to do is I'm going to say, I have to do a stream dot, right? No. What do I have to do? See that text? Create it. Here. Hold on a sec. Where's the stream?

Basically, what I'm going to do is I'm going to, I know it's going to fail because I'm not logged in. And then I'm going to ask in the prompt in the copilot chat. I'm going to say you need to log in to Pomerium. So that's what we're going to do. I just got, where's the stream?

[01:24:57] Oh yeah. Create an event. Create text event. That's what I want. Thank you. Cool. So let's go back to the error, which is over here. And I'm going to do this. And I'm going to say, you need to log into Pomerium. Okay. That should be good.

Okay. So let's. And then let's just forget the error handling for now. I'm just going to assume the happy error path, which is it's a 403 always. So let's go ahead and do a new chat. And let's just say, how do you do a bubble sort. And it should end up here. Okay. Really. Oh okay. You see I got you need to log into Pomerium, which is good. I forgot to do the done event. Sorry. Let's do that again. Cool.
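The failure mode and workaround can be sketched outside the extension: the request succeeds at the HTTP level, but the body is Pomerium's HTML 403 page rather than Ollama's JSON, so JSON parsing throws the "Unexpected token" error. A small guard for that case (in Python for illustration; the actual extension is JavaScript using the Copilot SDK's text events):

```python
import json

def parse_ollama_reply(body: str) -> dict:
    """Parse an Ollama JSON reply; if the proxy answered with an HTML
    error page (e.g. Pomerium's 403), raise a clear auth error instead
    of a bare JSON 'Unexpected token' failure."""
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        if body.lstrip().startswith("<"):
            raise PermissionError("You need to log in to Pomerium")
        raise
```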

Let's do this again. Okay. So it says you need to log into Pomerium. So what I'm going to do is I'm going to copy this URL here. This is, think of this as like my identity provider.

[01:26:14] And I'm going to come back to the Copilot extension. That should show up as a URL. So let's do it one more time. Okay. So it's going to say I need to log in now. So let's try that. I don't know.

Okay. It says I'm verified. So there's probably something else I need to do, maybe. And I'm not even sure. In this case, there's probably something I need to do to somehow get its session enabled in the copilot chat. But I did log in. I'm already logged in, so I don't think this is going to work. But yeah, it's still going to tell me. But this is what I did with OpenSauced when I was doing it.

So at this point it's more, I am logged in, but I need to get those credentials to somehow persist in the copilot chat, so that's whatever. But we can do like that.

Those docs that you shared before, if we come back to here, I already have this other one enabled, so I'm going to actually edit the policy. I'm going to remove this one. It's a bad name because it was only for Ollama, but I'm going to grant you access to it. And now if you go to this URL. I'll just drop it in the chat.

Okay. And if you want to share your screen for a second.

01:27:49 - Nick Taylor

Yeah. And we'll kind of wrap it up here because I know we're at like an hour and a half. Okay. So you have access to Open WebUI, which is running on my computer.

Now, you don't have an account right now, so you're not going to be able to log in yet. But again, I just wanted to point out that I could probably, like in that doc, configure it. But just for the sake of time, we've already seen how we could secure an endpoint, which is really what I wanted to show you there. But we could look at the docs real quick again, I guess.

01:28:35 - Anthony Campolo

Also, I saw there's a GitHub repo for an Ollama Copilot extension that has like 500 stars. That was linked on the official Ollama site. So you should check that out. You might have some ideas. Yeah.

01:28:48 - Nick Taylor

Yeah, yeah, for sure. Like, at the time, I didn't know that existed. And even if one did exist, I didn't really care. It was more for me to just try it.

01:28:58 - Anthony Campolo

But someone's asking if you guys host models, so. No, because what we're doing is we're actually running the model on your machine.

01:29:09 - Nick Taylor

Yeah. And to be clear, like, we're showcasing LLMs here, but the main thing that's securing it right now is something called Pomerium. So we're not an AI company per se. But if we go to the. Yeah, if we could go to those docs. Let me see here. Oh, you're still on my screen. Okay, cool.

Yeah. So if we go to the docs. Let's go to LLM. Yeah, I'll just double check here. There's probably something I might have missed, but this we have open. Yeah. So I'm passing the identity headers. Oh, actually, let me check if I am doing that. Do open, where edit there.

01:30:02 - Anthony Campolo

Someone asked about optimized models. So with Ollama you can run whatever models essentially will fit on your machine.

01:30:11 - Nick Taylor

Okay. So we want to pass identity headers. Cool. That's good. And yeah, we were looking more at the copilot extension, but we can see here.

01:30:26 - Anthony Campolo

Yeah. So you're using Code Llama, which is a coding offshoot of Llama. But you had DeepSeek in your list of models as well. This is what I really like about Ollama: you don't have to just use Llama. It's any open source model that it can kind of interact with.

01:30:44 - Nick Taylor

Yeah. So I've already configured it in Zero like we did. I've got a policy which is giving, for example, access to my domain as well as your email. We have the route. We've already done all this. So that's good. And then boom, boom, boom. And then we set the policy.

And then I think the thing is I just need to create an account for you on my local, and it also says I might need to enable websockets, so let's go do that.

01:31:24 - Anthony Campolo

Yeah. I think you could do Ollama without streaming turned on.

01:31:30 - Nick Taylor

Where was that? That's under timeouts. Okay. Allow websockets. Save that. Cool.

Yeah. Let's just. I guess I can create you an account. I actually haven't used Open WebUI that much. I created my own account. So let me log in.

01:31:59 - Anthony Campolo

Log in on yours. Yeah, just to kind of show it.

01:32:02 - Nick Taylor

Yeah. For sure. Let me just enable 1Password again. And where did it go? Oh, yeah, I gotta go. Extensions, 1Password. Okay. All right, so let's go ahead and log in. So now I'm logged in. Let me, I gotta zoom out a bit.

So I've basically secured. I'd have to see about creating an account for you. But basically this is running on my local machine. It's been secured by, I've created a policy to only allow my domain and your email address to access this. So like, I'll literally drop this in chat and somebody tell me if they can hit this endpoint. I'm going to guess no.

01:32:57 - Anthony Campolo

Yeah. So Ollama is agnostic to the models, so you can pull different models. And when you go to Ollama, there's a whole models section.

01:33:11 - Nick Taylor

Yeah.

01:33:12 - Anthony Campolo

Um.

01:33:13 - Nick Taylor

So I'm actually not sure if you can create multiple accounts for this. I've only started using Open WebUI recently, so I set up my admin account, but I don't know if you can add additional people. I think it might be just like a one-person kind of thing because it's your local.

But regardless, I can literally be anywhere in the world now as long as it is running. Like I'm going to put this on a Raspberry Pi eventually. It's like right now it's on my laptop. But the cool thing about this is I have a secured local model running now that I can share with my family. So like, obviously we pay for other services too, but anybody in the house could use this, which is kind of nice.

01:33:55 - Anthony Campolo

This is very cool. I want to get one of these set up for me as well, especially since R1 came out because it's the first actually open source model that is as good as the paid ones. Like really? Yeah. The Llamas were decent, but. Yeah. Is it, though? I think it's only the full R1 unless you're using a distilled model.

01:34:21 - Nick Taylor

Okay. Anyways, I don't know. I think it's pretty cool. Obviously I work here now, but I think.

01:34:28 - Anthony Campolo

It's super cool, dude.

01:34:31 - Nick Taylor

And honestly, it doesn't take a lot of setup. So like the TLDR here is you go to pomerium.com, you create an account, you'll have access to Zero. I went with the Docker install. I ended up running Pomerium locally, but like I said, I'm going to put this on my Raspberry Pi once I get that up and running.

You create some routes like we did over here. You secure what you want to secure, and then all of a sudden the stuff on my home network is available anywhere. I mean, obviously when I'm local I could still just go to the IP directly, but it's probably more convenient to just use the URL here.

But it's literally super secure. And we saw how I changed the policy on the fly there, and the cache policy got invalidated and you were automatically added, and then you could access these things. So it's pretty cool, I think.

01:35:27 - Anthony Campolo

And real quick, just because someone was talking about the chat, now that I'm thinking about you, you definitely are not running the full one. Because when you showed your models in Ollama, it was about four gigs for R1, which is oh yeah, it's.

01:35:40 - Nick Taylor

Not the full.

01:35:41 - Anthony Campolo

Yeah. So that's the most distilled one. If you want to actually run the biggest R1, it's over 600 billion parameters and it takes like a terabyte of data.

01:35:52 - Nick Taylor

Yeah. No. And the memory, like it was kind of funny, but like yeah.

01:35:57 - Anthony Campolo

This is a really good point. And this is still the issue with even the open source models that are good. You need like a ridiculous rig to run it.

01:36:06 - Nick Taylor

Yeah, for sure. And there are people that run ridiculous rigs to just have access to this. But I guess to my point, it's more like even this is still helpful. Like, obviously we're devs, so we would want the more powerful ones like the cloud-based ones. But I still think it's pretty neat.

And you know, we can secure literally anything at this point now. So you could run, well, whatever you want, honestly. So I don't know. Like I said, I'm still pretty new to the company, but this is one of the things that kind of got me excited about it. And even though I'm kind of newer to the security space, there's a lot of cool stuff we've been doing.

01:36:53 - Anthony Campolo

Okay, we're wrapping up. You should give the high level pitch just one more time for the new viewer.

01:36:59 - Nick Taylor

Yeah. So basically, Pomerium is using the zero trust security model. You can drop a link to it. If you go to, there's a zero trust section in the docs there, Anthony. But yeah, essentially the way it works is it's always verifying who you are, what action you're trying to take, what device you're on. I mean, you can configure what you want here.

And this is a big change from traditional stuff like perimeter-based security, like a VPN, where you log on to the VPN and then you're considered golden, trusted on the whole network. Whereas with zero trust, it doesn't mean no trust. It just means you only trust certain people based on who they are, their context, you know? And the context can be many things: device, time of day. There's a bunch of things. So it's pretty neat.

[01:37:58] Yeah. If you want to check it out, it's pomerium.com, and I'd encourage you to start off with Zero. So if you just create an account, you can get set up with Zero pretty quick. It's open core like we mentioned at the beginning, so it is open source. But there's also an enterprise version and there's Pomerium Zero as well. Yeah. No worries. Thanks. Thanks for hanging.

01:38:24 - Anthony Campolo

Yeah. Not every day we get random new viewers around. So super cool.

01:38:29 - Nick Taylor

Nothing wrong with that.

01:38:30 - Anthony Campolo

Awesome, man. Well, thanks for going over all of this with me. This is something that actually I do kind of have a use case for because I would like to be able to kind of expose my own models for like AutoShow. I don't know if it's really going to be that feasible, but I kind of want to just try it to see what it would be like.

And yeah, this is one of the things I was going to have to solve, like how do I actually, once I get the llama on the internet, how do I secure the thing? Yeah, yeah, for sure.

01:38:59 - Nick Taylor

And I'm definitely going to check out that llama copilot extension because one, I wasn't aware of it, but they probably figured out the flow for login or something. But anyways.

Yeah. No, thanks again for having me on, man. It's always good hanging. And you know, we should hang out a little more often. I know we've both been pretty busy, so.

01:39:21 - Anthony Campolo

Yeah. No, I'm happy to swap streams again. Every other live stream is showing Trump talking about the plane crash, so this is much better. Yeah. We stay away from politics here, so no worries on that. Yeah. Yeah. Cool, cool. You're from Canada. You don't even know. You don't even know what a president is, right?

01:39:42 - Nick Taylor

I definitely do not want to talk about politics. So I'll just leave it at that.

01:39:48 - Anthony Campolo

Cool, man. Great. Well, I think we can call it here. This is a really good stream.

01:39:54 - Nick Taylor

Cool. Thanks, man.

01:39:56 - Anthony Campolo

All right. Later, everybody.
