
AutoShow CLI Pt.2 with Nick Taylor
Episode Description
Anthony Campolo and Nick Taylor discuss MCP security with zero trust proxies, test a new key moments prompt in the AutoShow CLI, and demo image and text-to-speech features.
Episode Summary
Anthony Campolo and Nick Taylor reconnect for part two of their AutoShow CLI series, starting with Nick's experiences at the MCP Dev Summit and AI World Fair, where he observed growing enterprise interest in the Model Context Protocol. Nick explains his work at Pomerium, a security company bringing zero trust architecture to MCP servers, walking through how identity-aware proxies can manage OAuth tokens more securely than storing them directly on MCP clients. The conversation shifts to hands-on testing of a new AutoShow CLI feature: a configurable "key moments" prompt that identifies short, compelling segments from video transcripts, designed to streamline Nick's workflow for creating short-form clips from longer livestreams. They test the prompt against both ChatGPT and Claude, merge the PR, and discuss how future improvements like structured outputs and ffmpeg-based clip generation could further automate the pipeline. Anthony then demos upcoming AutoShow features including image generation across three services (Black Forest Labs, DALL-E, and AWS Nova) and text-to-speech comparisons between open source tools and ElevenLabs, noting how AWS Nova's multimodal capabilities could eventually handle an entire content production workflow. They wrap up with plans for upcoming conferences, livestreams, and the AutoShow app launch.
Chapters
00:00:00 - Catching Up and Conference Recaps
Anthony and Nick kick off the stream with some moving day banter before diving into what they've been up to since their last session in early May. Nick shares his experiences attending the MCP Dev Summit and the AI World Fair in San Francisco, noting both felt like pivotal moments for the AI ecosystem this year.
The conversation touches on key figures and developments from those events, including Google releasing a new model during a keynote and Block's Angie Jones discussing how the company has gone all-in on MCP and their open source tool Goose. Anthony shares his own impressions of Goose after a recent demo stream, calling it a strong player in the MCP space with meaningful contributions to the spec itself.
00:05:44 - MCP Security and Zero Trust Architecture
Nick explains the current state of MCP tooling, from Goose's remote server support to VS Code's new stable release with full OAuth flow for remote MCP servers. He then transitions into his core work at Pomerium, walking through an article he wrote for The New Stack about securing MCP servers with zero trust proxies.
Using a Mermaid diagram from the article, Nick breaks down how OAuth tokens flow between MCP clients and servers, and why storing those tokens on the client side poses security risks. He explains how an identity-aware proxy like Pomerium intercepts those tokens and returns short-lived replacements, adding layers of protection through policy engines that can enforce time-based, device-based, and role-based access controls beyond simple authorization scopes.
00:18:14 - From Security Talk to AutoShow Testing
The discussion bridges from MCP security to practical work on the AutoShow CLI, with Nick mentioning his upcoming conference talks at Black Hat and Commit Your Code. Anthony introduces PR number nine, a new configurable "key moments" prompt designed to extract short, compelling clips from video transcripts.
Nick walks through checking out the PR using the GitHub CLI and exploring the prompt structure, which allows users to specify how many moments they want and how long each should be. They discuss how this feature directly addresses Nick's real workflow need of finding sixty-second segments for YouTube Shorts and Blue Sky posts, a task he previously handled by manually scanning entire videos.
00:31:19 - Testing Key Moments and Merging the PR
Anthony explains the existing create clips script that uses ffmpeg to split audio based on chapter timestamps, and how it could be adapted to work with the new key moments output to generate actual video clips. They troubleshoot some setup and file path issues before successfully running the prompt against Nick's livestream with guest Dougie about self-hosted AI coding assistants.
Both test the output by pasting it into ChatGPT and Claude, comparing results across models including 4o and o3. The key moments look accurate and useful, so Nick reviews the PR, applies a minor Copilot suggestion to extract magic numbers into constants, fixes an incomplete auto-suggestion, runs a quick validation command, and merges the PR with a squash merge.
00:57:44 - Image Generation and Text-to-Speech Demos
Anthony shifts to demoing upcoming AutoShow features he's been developing in a separate workspace. He shows image generation running across three services simultaneously: Black Forest Labs, DALL-E, and AWS Nova Canvas, all producing a sunset-over-mountains prompt. He explains his interest in Nova's multimodal ecosystem, which includes video generation through Nova Reel, as a potential end-to-end pipeline for content production.
The text-to-speech demo compares an open source tool called Coqui against ElevenLabs using a deliberately silly test phrase. The paid ElevenLabs model handles the nonsensical text noticeably better and runs faster, highlighting the quality gap between open source and commercial options. Anthony outlines his plan to upstream all these features into the main repo and launch the AutoShow app before mid-July, ahead of an appearance on JavaScript Jabber.
01:12:24 - Upcoming Plans and Wrap-Up
Nick and Anthony share their near-term plans as the stream winds down. Nick mentions upcoming livestreams on the dev channel covering Astro and event-driven architecture with a guest from DigitalOcean, along with his regular weekday morning streaming schedule that fits around his West Coast team's meeting cadence.
They briefly engage with chat comments from viewer Fuzzy about AWS Bedrock, stock AI images, and browser-based text-to-speech APIs. Anthony reflects on the productive session, having tested and merged the key moments feature while demoing image and audio capabilities, and they agree to reconvene in another month or two for their next collaborative stream.
Transcript
00:00:03 - Anthony Campolo
Welcome back everyone to AJC and the Web Devs AutoShow CLI Part 2 with Nicky T. What's up, Nicky? I am in moving mode, which is why I'm also in homeless mode. So okay, that's why everything looks the way it does, but I'm feeling pretty good. Just been cranking it out for like two weeks.
00:00:26 - Nick Taylor
Yeah, moving is always a pain. I mean, obviously it's nice to move to somewhere else. Hopefully where you move to is an upgrade of sorts for sure. But it's definitely a pain packing and stuff. I mean, it's one thing if you're some C-suite executive where people just come in and pack everything for you, but I'm assuming you're in the same camp as me where you typically box your own things. And then it's just like, I don't know, after you move, I mean, I haven't moved in a long time because I live in a house now, but I think we had boxes in the basement for like three years that weren't even opened, you know?
And then it's also the best time. If you're moving, if there's anything remotely you don't use, that is the best time to get rid of it.
[00:01:16] You know.
00:01:17 - Anthony Campolo
We've moved like three times within the last five years. So this was definitely the most streamlined out of all of them, because we've gotten rid of stuff incrementally along the way, which I'm a huge fan of. I like living very light. So yeah, it's good that we're in a very nice place now, so I'm a big fan.
Tell me about what's up in your life. Have you been attending conferences since we last talked? It was around the beginning of May. We did our last stream.
00:01:46 - Nick Taylor
Oh, yeah. I've definitely been on the move, in a good way. Lots going on, but if it was since May, then I went to the first MCP Dev Summit, the Model Context Protocol summit, at the end of May. Then a couple weeks later I was at the AI World Fair in San Francisco as well.
00:02:18 - Anthony Campolo
Yeah, those are the two big ones. We were both there at the same time. I think we didn't run into each other, but Ishan talked about it. He was on the stream a couple of weeks ago. Definitely give me your take on that, would you? What did you get from that conference?
00:02:30 - Nick Taylor
I thought it was really well organized. I wasn't there to give a talk, so I was working a booth for the most part, but I got to catch, I don't know if you know Nick Nisi from JS Party and stuff, but him and his coworker Zack ran a workshop using AI. So I caught that and I was able to catch some talks, but it was just a really well-run conference. And honestly, I mean, I'm sure they say it every year about AI or whatever, but honestly, I feel like going to that MCP Dev Summit plus the AI World Fair were critical moments this year in terms of the AI space. I mean, obviously there's other conferences, I know, but...
00:03:17 - Anthony Campolo
You feel like you're in the action, though.
00:03:19 - Nick Taylor
Yeah. Even Logan, I forget his last name, from Google. They released a new model at the conference in one of the keynotes. No, just lots of excitement. Yeah. Kirkpatrick. That's it.
00:03:37 - Nick Taylor
Kilpatrick. Yes, sir. It's funny, I was on a live stream with him when he used to be, I think, the DevRel for the Julia language, which I'm not too familiar with, but I was on a live stream for Hacktoberfest when I think I was still at Netlify, or maybe when I had just finished working at Dev.to. But that's where, well, I didn't really meet him. Obviously, you're just on a live stream, but he seems to be doing some good stuff over there at Google.
Lots of excitement around stuff. At the booth, it was interesting because we can talk about what I've been working on, but there are a lot of people for whom AI is still super fresh. It's not like people are like, okay, we're rolling this out.
[00:04:36] There were a lot of people that were just curious about a lot of things, not necessarily committing to it right away. But then you have people all in, like at the MCP Dev Summit. Angie Jones, who was there from Block, was just saying how Block is all in on MCP. Yeah. Goose. Yeah. Goose for sure. And MCP all the way. They use MCP and Goose internally, like they're dogfooding it.
00:05:07 - Anthony Campolo
So I had Rizel and Ebony on one of my more recent streams to talk about Goose specifically. And once they showed it to me, I had a big aha moment. I feel like Goose is a thing that, for me, really pulls a lot together in terms of the MCP world and does a really good job. So I feel like they're positioned in a really interesting spot, and I think they said that they even contributed to the spec, which is probably still going to be modified at this point. So it's still in progress, but that still shows that they're kind of going to be in it from the ground floor at a size and a level of investment that not a lot of other projects will be able to match.
00:05:44 - Nick Taylor
Yeah, and I think it's smart that it's open source. I don't use Goose every day, but I have it on my machine. One thing about the MCP support they have right now is you can run local MCP and you can run remote ones, like remote servers in a certain context. You still have to run it in the configuration as like npx run remote. It's like an npm package to run a remote MCP server, but it doesn't handle an authentication flow or anything right now, at least from what I saw in the latest version of Goose. I was trying to vibe code it to do it, but it's all written in Rust and I got somewhere, but honestly, that was more me just having a beer on a Friday night going, hey, let's just vibe code this. But it didn't happen.
There's a lot of stuff going on in the MCP space. Even in Visual Studio Code now they have MCP support. They had MCP support before, but now they support remote MCP servers, including the whole OAuth flow and everything. So it was in insiders for a while, but now it's in the stable release of VS Code. Pretty neat stuff.
00:07:15 - Anthony Campolo
And that's basically what your company is meant to do, is trying to add that auth flow to MCP. Am I right about that?
00:07:24 - Nick Taylor
Yeah. We're kind of bringing zero trust to the forefront. My CEO actually opened an issue in the spec about this because if you read the security best practices, there's stuff. And I wrote an article about it too that I can find for you. Sure.
Basically, in the security best practices, they say you should have a proxy. It's table stakes if you're doing a production kind of thing. It's one thing if.
00:08:04 - Anthony Campolo
It's just any other kind of server that you deploy.
00:08:08 - Nick Taylor
Yeah. So let me find it. News. Oh yeah. Here it is. Of course it's got a spicy title, but I'll share it in the private chat.
00:08:21 - Generated/demo audio
Server's gonna get hacked and you'll be wrecked. The Complete Guide 2025.
00:08:28 - Nick Taylor
The thing is, the TLDR is they've updated the spec and it adds OAuth into the spec.
00:08:37 - Anthony Campolo
But you wrote for The New Stack. That's cool.
00:08:40 - Nick Taylor
Oh, thanks, man. My second post for The New Stack. But I'm not going to go through all of it. Basically, with the agentic workloads, you don't know what they're doing. I mean, you might know to some degree, but having a proxy gives you observability because everything is going through the proxy. And the proxy runs at layer seven, meaning it's at the HTTP or application level.
00:09:06 - Anthony Campolo
Actually real quick, pull up the article on your screen and then yeah, cool. Show map to what you're talking about so we can see, if not.
00:09:15 - Nick Taylor
Yeah.
00:09:16 - Anthony Campolo
There's visuals in this but at least text. Oh, you got one good graphic it looks like.
00:09:21 - Nick Taylor
Yeah.
00:09:21 - Anthony Campolo
And I wanted you to test a PR after this anyway. So this would be a good time to transition to the screen share.
00:09:27 - Nick Taylor
Yeah.
00:09:28 - Anthony Campolo
So this is maybe [unclear].
00:09:32 - Nick Taylor
Yeah, I already zoomed it in. I don't know if that's okay. That should be okay. It's obviously like a little spicy, the title maybe, but you know who doesn't like spice, right? But yeah, so this is not a knock on MCP at all. I think they're doing a lot of great work with the spec and stuff. These are just things that we think can be improved.
00:10:02 - Anthony Campolo
Which makes sense because the people who I'm sure created the spec weren't necessarily security experts. They would have some degree of security training, like all developers do, but it wasn't created by a security company.
00:10:16 - Nick Taylor
Well, there's people on the steering committee from Okta, and then there's Masky here who works over at Microsoft, so he's like a security expert. I actually, a funny side story, but he was my product manager at Netlify for a while when I first started there. So it's just a small world.
To give some context, the Model Context Protocol only came out in, like, November of 2024. I think it was like November 24 or something. It's late in the year basically. So we're only at June 30. We're talking like seven months this has been out in the wild, kind of, you know. And I think with new stuff, especially devs, it's like, because the remote MCP servers came after, but everybody's running these local ones and you're kind of like, oh, that's my machine, you know, whatever.
[00:11:16] It's kind of like you give it admin access to everything, you know what I mean? Because you're just trying out stuff.
00:11:22 - Anthony Campolo
Vibe. Yeah.
00:11:23 - Nick Taylor
Yeah, exactly. But with these remote MCP servers, you do want to lock them in a bit more. And there's some good things about access.
00:11:32 - Anthony Campolo
To an LLM. You can wreak havoc on all sorts of stuff.
00:11:36 - Nick Taylor
Yeah. And the thing is, OAuth, you can get that authorization. And there's static scopes like you have access to, I don't know, like your Google Calendar or whatever. That's one thing. But there's a few other things. There's the flow that happens when I come down to here. Yeah. Mermaid diagram down here.
00:12:03 - Anthony Campolo
So zoom that in a little bit because the text is pretty small, or if you like, open. Yeah. Bump it up still. I don't know if you could.
00:12:12 - Nick Taylor
Yeah. It's doing that funny thing. When you zoom in too much, it starts moving down, which I don't understand. But anyways.
00:12:20 - Anthony Campolo
That's good.
00:12:20 - Nick Taylor
I think this is our flow, but the thing to mention is when you go to register with an MCP client. Sorry. Yeah, we can walk through it this way. So you're a user here, and you just want to register an MCP server. So you're using some kind of MCP client. That could be VS Code at this point. It could be Goose. It could be Claude Max if you're using their cloud integrations.
But basically, you go to register the MCP server, so you put a URL in and then it does the auth and stuff. This is with Pomerium. So I'll talk about this after, but basically you go to register and then the MCP server, if you authorize with it, you're going to get a token back. And that token gets stored on the MCP client because it needs to use it to go back and forth.
[00:13:25] So imagine you have a token that gives you access to GitHub, another one to Notion. We like to think that people aren't nefarious or MCP clients aren't nefarious, but who knows? You're giving the keys to all that stuff. And I highly doubt like Claude or Goose are going to do that, but there could be weird things LLMs do too. And you have these tokens, so there's a potential security risk there. And in the spec too, I believe it mentions that the proxy should manage them, or I might be mixing the spec with what we're doing, but essentially, what we do is when you log in, like when you register an MCP server with a client, I'll come down to the one that actually has a token. Let's go down here.
00:14:26 - Anthony Campolo
More lines and arrows.
00:14:28 - Nick Taylor
Yeah. That's doing the super shrink again. Okay. So typically, if you don't have an upstream service, like your MCP server isn't connecting to GitHub or something, it's just maybe an internal app. There's that part. So you can do that OAuth.
But the other thing is, if you're getting access to other services upstream, like your MCP servers for GitHub or Notion, like I said, those tokens go back to the MCP client and it's actually better to get them inside the proxy so that in the context of what happens, we know which tokens you have that you've used to register for MCP servers. In simple terms, just think of it like there's a lookup table that knows I'm Nick and I've used these tokens.
But basically what happens is when you get to Pomerium, when it's coming back from the MCP server, Pomerium actually just returns a short-lived token to the MCP client. So even if somebody steals that when it's short-lived, one, it's short-lived, and two, it's not the keys to GitHub. It's not the keys to your Notion. You know what I mean? From a security standpoint, that's a lot better. And there's some debate about who should be owning that, but I think it makes sense at the proxy level.
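The exchange Nick walks through here can be sketched in a few lines. This is a hypothetical illustration, not Pomerium's actual implementation: the upstream OAuth tokens stay inside the proxy's own store, and the MCP client only ever holds a short-lived, opaque replacement.

```typescript
// Hypothetical sketch (not Pomerium's actual implementation) of the
// token exchange Nick describes: upstream OAuth tokens stay inside the
// proxy, and the MCP client only ever holds a short-lived replacement.

interface StoredGrant {
  user: string;
  upstreamTokens: Record<string, string>; // e.g. { github: "...", notion: "..." }
  expiresAt: number;                      // epoch milliseconds
}

// The "lookup table" Nick mentions: short-lived token -> real grant.
const grants = new Map<string, StoredGrant>();

function issueShortLivedToken(
  user: string,
  upstreamTokens: Record<string, string>,
  ttlMs = 5 * 60_000
): string {
  const token = `sl_${Math.random().toString(36).slice(2)}`;
  grants.set(token, { user, upstreamTokens, expiresAt: Date.now() + ttlMs });
  return token; // this is all the MCP client ever stores
}

function resolveUpstreamToken(shortLived: string, service: string): string | undefined {
  const grant = grants.get(shortLived);
  if (!grant || Date.now() > grant.expiresAt) return undefined; // unknown or expired
  return grant.upstreamTokens[service]; // only the proxy ever sees the real token
}
```

Even if the short-lived token leaks, it expires quickly and never contains the GitHub or Notion credentials themselves.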
The other thing is, if you introduce zero trust, and this doesn't have to be Pomerium, it could be Google's identity-aware proxy. I think Cloudflare has something similar too. But like real zero trust security, what happens is literally every request is checked every time via a policy engine that you have. It's a piece of the identity-aware proxy. So in terms of setup, I'm kind of explaining zero trust here now, but zero trust, you have the identity-aware proxy at your network edge and then you have your internal apps and stuff.
[00:16:33]
And those could also be MCP servers. It's very close to where your internal stuff is, so latency-wise that's kind of negligible. But what happens is it comes in and, like, let's just say you try to access any internal service, let's just say it's an MCP server. It's going to go, oh, sorry, Anthony, you can't reach this. Your email is not in the .com domain. That's a very simple policy.
But the policies can get more complex. They can be time based. It could be like, am I on call or not, you know, based on pulling in some third party data like PagerDuty or something. And all of a sudden it's like I might only have access to certain internal resources, including MCP servers, at certain times. Or if I'm on a registered device only. So it's not just like authorization lets you do this, but it's like, should you be doing this now?
[00:17:33] So it's really meant as a complement: the OAuth improvements they've made to the spec are good, and like I said, the security best practices already mention proxies, so it just seems to make sense. And any identity-aware proxy is running at the app layer, layer seven on the network stack, so it's a natural fit, because for remote MCP servers this is all HTTP. There are obviously local ones that use stdio. But that's kind of what we've been working on in a nutshell.
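The per-request checks Nick describes (time-based, device-based, role-based) can be sketched as a tiny policy engine. This is an illustration only; real engines like Pomerium's are configured declaratively, and all names here are hypothetical. The point is that every request is re-evaluated, not just the initial login.

```typescript
// Hypothetical sketch of zero-trust policy evaluation: every request
// carries context, and ALL policies must pass on every single request.

interface RequestContext {
  email: string;
  deviceRegistered: boolean;
  hourUtc: number;  // taken from the request time
  onCall: boolean;  // could be pulled from a third party like PagerDuty
}

type Policy = (ctx: RequestContext) => boolean;

const policies: Policy[] = [
  (ctx) => ctx.email.endsWith("@example.com"),                    // identity: allowed domain
  (ctx) => ctx.deviceRegistered,                                  // device-based
  (ctx) => (ctx.hourUtc >= 9 && ctx.hourUtc < 17) || ctx.onCall,  // time-based, with on-call override
];

function authorize(ctx: RequestContext): boolean {
  return policies.every((p) => p(ctx)); // re-checked on each request, not cached at login
}
```

This goes beyond static OAuth scopes: a request that was allowed an hour ago can be denied now because the context changed.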
00:18:14 - Anthony Campolo
Have you talked to Kent about this at all? I know he was talking about auth and MCP, at least a month or two, maybe longer, ago. I saw him tweeting about it.
00:18:24 - Nick Taylor
Yeah. I haven't spoken with him directly. I saw him at the MCP Dev Summit. I just said hello briefly. I think he was pretty busy, but yeah, he's a.
00:18:34 - Anthony Campolo
Pretty busy dude. But I know it's a thing that he has already thought about and was talking about online. So you can at least find his tweets about it.
00:18:43 - Nick Taylor
Yeah. No, I know he was. I think he was talking about auth for the Epic Stack using email. I can't remember the exact flow, but this is like, I guess the other thing to mention here, in terms of zero trust, is yes, it can secure these things. But the other thing too is this is really enterprise grade. Somebody might not care if it's just like, I'm just messing around with an MCP server, you know, proof of concept.
But businesses are literally using these today, so you want something enterprise grade to secure this. So again, like I said, this definitely is something that I think is required. And it definitely enhances what the spec already does. And there has been a, like Den actually, who I mentioned, gave this article a shout out on LinkedIn, which was pretty cool.
Anyways, that's kind of where I'm at, and I've built a client for this, so like a chat app basically. And we load MCP servers and stuff, and yeah, it's been fun, you know. But to be clear, we're not an AI company. We're security software that just secures stuff in general. It just happens to work really well with MCP.
00:20:04 - Anthony Campolo
So yeah, that's interesting. I'm sure for you, it's just cool to get to work at a company that is engaging with this new tech at a high level.
00:20:15 - Nick Taylor
Yeah, it's super fun. I've been doing a lot of the engineering for this, at least for the client side of stuff. And like I mentioned to you before the live stream, we're going to be doing webinars probably once a week. Not webinars in the traditional sense, but more like, "Hey, let's build an MCP server today," or, "This is how you get your dev environment set up." And then, like, I think it's really more just like, you know, raw webinar. It's going to be more live stream but with...
00:20:45 - Anthony Campolo
Like a live workshop a little bit.
00:20:46 - Nick Taylor
Yeah. Exactly. So yeah. Yeah. And it's just fun. So yeah, lots going on. But enough about MCP. What's going on with AutoShow.
00:20:57 - Anthony Campolo
Yeah. So I've got a PR open for you to test. Let me throw the link in the chat. I think it's PR nine. Yep. I always do this thing where I'll kind of clear my browser before a stream. Okay, there it is. I'll drop it in both of the chats.
00:21:25 - Nick Taylor
All right. Cool.
00:21:26 - Anthony Campolo
So this is the key moments prompt that we're adding.
00:21:30 - Nick Taylor
Okay. Yeah.
00:21:31 - Anthony Campolo
You wanted to extract shorter clips. It's a prompt and a way to insert how many key moments you want and how long you want them to be. You want to fire that up and maybe take one of your videos that you would actually do this for work. I'd be curious to see what it gives and if it matches what you'd actually want, and if not, how we could tweak it.
00:22:05 - Nick Taylor
Yeah, for sure. I feel you. Okay, let me just close it. I've got an MCP server running right now. I'm just going to close it.
00:22:15 - Nick Taylor
Yeah.
00:22:16 - Anthony Campolo
No AI that's running infinitely on my machine.
00:22:20 - Nick Taylor
Yeah, exactly. Okay, cool. I'm pretty sure I have AutoShow CLI on my work machine, because I think we livestreamed here last time, but worst case I'll pull it down. Give me two secs here. All right. There we go. AutoShow CLI.
And what? PR number nine. Okay. Clear. Let me zoom in here. Boom. All right, gh checkout, good old GitHub CLI. Check out PR number nine. Oh, I must have some stuff not committed. Hold on a sec.
00:23:02 - Anthony Campolo
Yeah, whatever stuff you got is not relevant at this point. We've kind of moved.
00:23:08 - Nick Taylor
Oh, it's the staging.
00:23:11 - Anthony Campolo
Yes. It's on the staging branch. If you click the little code thing on the top right, it gives you the exact command to run that one.
00:23:18 - Nick Taylor
Oh, yeah. Exactly. I use the shorthand. I actually didn't know they showed that for the PRs. I see. Yeah, I know they have it when you clone the repo. I typically grab the GitHub CLI command, but I didn't know they showed it for PRs too. So today I learned something.
00:23:36 - Anthony Campolo
That is the second most thing I use after cloning down the repo.
00:23:42 - Nick Taylor
Oh yeah, I'm in. I use the GitHub CLI every day. It's one of my go-to tools. Okay, so let's go in here.
00:23:52 - Anthony Campolo
If you open up the prompts docs.
00:23:57 - Nick Taylor
Okay.
00:23:58 - Anthony Campolo
Markdown file.
00:24:00 - Anthony Campolo
That will give you all the options.
00:24:03 - Nick Taylor
I'll prompt that test or prompt options. Right.
00:24:08 - Anthony Campolo
You said just open up your file picker real quick.
00:24:13 - Nick Taylor
Yeah.
00:24:15 - Anthony Campolo
Yeah, that's actually it. Okay, you got it.
00:24:18 - Nick Taylor
Cool. Let's close this. Let's close Copilot for the moment, okay.
00:24:24 - Anthony Campolo
And so scroll down a little bit. Right there.
00:24:29 - Nick Taylor
Key takeaways.
00:24:30 - Anthony Campolo
It's right under that one. Those are the ones that already existed. The code block right under that is the key moments one. Scroll down a little bit.
00:24:39 - Nick Taylor
Oh yeah. Key moments. Sorry.
00:24:40 - Anthony Campolo
Yeah. Okay. I think this is the first prompt that's kind of configurable in this respect, or maybe not, but it gives you a couple different options that you can configure. That's cool because it gives me a base to work off of if I want to do the same thing, like if I want 5 to 10 chapters. You'll be able to select that.
00:25:05 - Nick Taylor
And quick question. So you're using an RSS feed, yes?
00:25:09 - Anthony Campolo
Or you could just change that to video. The reason why I'm doing that is because, in terms of the options I have, that one gives you a ten minute clip versus a 1 or 2 minute long clip. So it's better if you want to test, like, I want three key moments that are each a minute long, like two minutes long. I can't actually do that.
00:25:32 - Nick Taylor
Okay, gotcha. So all right, let's grab, I have to do this one too. But the first one I wanted to do is the one with Dougie. Yep. Okay. And that's the video. Boom. Okay. Does that look solid?
00:25:58 - Anthony Campolo
Yeah. And then do you want to do ChatGPT or Claude?
00:26:02 - Nick Taylor
Okay. Yeah.
00:26:04 - Anthony Campolo
Actually, what do you have? What stuff do you have in your... You don't need to open it, obviously.
00:26:11 - Nick Taylor
Yeah, I can put a... I think I have an OpenAI API key right now. I actually have to update it, but it works with the OpenAI API key, right?
00:26:23 - Anthony Campolo
Actually, let's do this. Let's take off the dash key, because then we're going to be able to get the actual prompt. Then we'll just drop that into a chat window, okay?
00:26:35 - Nick Taylor
Yeah, okay.
00:26:36 - Anthony Campolo
Because this way we can see it. It's good to kind of see the prompt first to give you an idea.
00:26:40 - Nick Taylor
Gotcha.
00:26:41 - Anthony Campolo
What it is. And this will take a bit, so while this is running you should open up the prompt sections file, or it's just called the sections file, and go to what the prompt is. We can see that here.
00:26:58 - Nick Taylor
Okay.
00:26:59 - Anthony Campolo
Yeah. It's the very last one, so scroll all the way to the bottom.
00:27:03 - Nick Taylor
Okay. So key moments. There you go. Identify the most compelling segments from the transcript with whatever the count is. The duration should be whatever.
In my case, the way I was using it right now, I was doing this manually before we did this, but I said 60s because I wanted to use these as YouTube shorts. I know YouTube shorts are longer now, but I post it to Blue Sky, which only does 60s, and it just made sense to keep it the same length.
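The configurable prompt being described, with a moment count and a target duration, can be sketched as a small builder function. This is the general shape only; the function name and wording are hypothetical, not the actual text from the AutoShow PR.

```typescript
// Hypothetical builder for a configurable "key moments" prompt -- the
// general shape, not the actual wording from the AutoShow PR.

function keyMomentsPrompt(count: number, durationSeconds: number): string {
  return [
    `Identify the ${count} most compelling segments from the transcript.`,
    `Each segment should be roughly ${durationSeconds} seconds long.`,
    `For each segment, return the starting timestamp, a short title, and a sentence on why it works as a standalone clip.`,
  ].join("\n");
}
```

So Nick's YouTube Shorts / Bluesky case would be something like `keyMomentsPrompt(3, 60)`, while a longer-form cut could use `keyMomentsPrompt(2, 180)`.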
00:27:39 - Anthony Campolo
But now you could configure it so you could do 60 or 300 or however long. Math-wise, three minutes, that would be 180.
00:27:49 - Nick Taylor
Yeah. So the main reason we started to work on this was because, for folks who might not have caught the previous stream, AutoShow is good at pulling out transcripts. I would pass in my video, get the transcript, and copy-paste the transcript into ChatGPT or Claude. I'd say, not exactly this prompt, but I'd say, find me one or two compelling moments that are no longer than 60s in the transcript.
00:28:21 - Anthony Campolo
You'd have that prompt. You kind of just copy and paste stuff and then modify it a bit for different use cases.
So this is literally the exact thing I was doing with AutoShow while I built the CLI in the first place. It was just a question of writing that specific prompt because I have all these different prompts. But what you want is slightly different, because what I have is something called chapters, which tend to be around 6 to 10 minutes long, whereas usually it's like a minute long.
This has come up when I've shown it to other people, like Alex from Coding Cat. He wanted to create shorts, the same kind of thing, wanting something much shorter and really concise, like a minute to a minute and a half.
00:29:03 - Nick Taylor
Yeah. No. The cool thing would be, and this could be automated, but the thing is, right now I end up creating a video from this. I get the timestamp in the transcript, and then I go to the video. I edit it from that point because the transcript might say there, but I might start a few seconds before or whatever. Then I edit the video because generally it's been pretty good at finding the key moments.
At that point, it's really a starting point for me, because before AutoShow I would scrub through the whole video and basically rewatch it until I found some good sections. It's very time consuming, and that's why I started doing this.
[00:30:03] But it would be cool long term if I could just say, generate me three video clips no longer than 60s.
00:30:13 - Anthony Campolo
So check this out. Go back into the project and look for the create Clips file.
00:30:20 - Nick Taylor
Okay. Oh, sorry. Yeah. So that's what this is.
00:30:25 - Anthony Campolo
Yeah. So this is not actually integrated into the CLI. This is like a one-off script that I created a long time ago, and it's specifically designed to work with the chapters. When you get the chapters, the markdown will usually have an H2 for the chapters section and then an H3 for each chapter. What this does is it strips that out, and it uses ffmpeg to split the video at those exact chapters. That is the way it currently works.
There's a couple issues with it. One is that it's only working off of the audio file, after it strips the audio from the video file. So to work on the video file, this needs to be slightly modified. It wouldn't be very challenging, but that's the first thing.
It would also possibly need to be modified if it's going to work with the key moments, because it might, instead of looking for chapters, look for the key moments, so the markdown has to come out correct.
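The markdown-stripping step described above could be sketched like this (a hypothetical illustration; the actual create clips script and its heading format may differ): parse H3 chapter headings with timestamps out of the generated markdown, producing start times that can be handed to ffmpeg.

```typescript
// Hypothetical sketch: parse chapter headings like "### 00:05:30 - Title"
// out of generated show-notes markdown, yielding segments for ffmpeg.
interface Chapter {
  start: number; // seconds into the video
  title: string;
}

function parseTimestamp(ts: string): number {
  // "00:05:30" -> 330; also handles "MM:SS"
  return ts.split(":").map(Number).reduce((total, part) => total * 60 + part, 0);
}

function parseChapters(markdown: string): Chapter[] {
  const chapters: Chapter[] = [];
  for (const line of markdown.split("\n")) {
    const match = line.match(/^###\s+(\d{1,2}(?::\d{2}){1,2})\s*-\s*(.+)$/);
    if (match) {
      chapters.push({ start: parseTimestamp(match[1]), title: match[2].trim() });
    }
  }
  return chapters;
}
```

Each chapter's start (and the next chapter's start as its end) could then be fed to an ffmpeg invocation.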
[00:31:19] I also have a refactor of this project in a couple different ways that I haven't brought back into the main repo, some of which involve structured outputs, which kind of solves this problem and ensures that the stuff you're getting back conforms to a JSON schema.
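Structured outputs constrain the model's response to a JSON schema instead of a free-form chunk of markdown. A minimal sketch, with illustrative field names (not AutoShow's actual schema):

```typescript
// Hypothetical JSON Schema for a "key moments" structured output.
// Field names are illustrative only.
const keyMomentsSchema = {
  type: "object",
  properties: {
    moments: {
      type: "array",
      items: {
        type: "object",
        properties: {
          start: { type: "number" },   // seconds into the video
          end: { type: "number" },
          summary: { type: "string" },
        },
        required: ["start", "end", "summary"],
      },
    },
  },
  required: ["moments"],
} as const;

// Minimal shape check; a real pipeline would use a JSON Schema validator.
function isKeyMoments(value: unknown): boolean {
  if (typeof value !== "object" || value === null) return false;
  const moments = (value as { moments?: unknown }).moments;
  if (!Array.isArray(moments)) return false;
  return moments.every(
    (m) =>
      typeof m === "object" && m !== null &&
      typeof (m as any).start === "number" &&
      typeof (m as any).end === "number" &&
      typeof (m as any).summary === "string"
  );
}
```

With a guarantee like this, downstream steps such as clip generation no longer need to re-parse markdown headings.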
00:31:34 - Nick Taylor
Okay.
00:31:34 - Anthony Campolo
So basically what I'm saying is all the pieces are already there to do that, with a couple things that need to be fixed and glued together. Okay, gotcha. 90% of the way there.
00:31:47 - Nick Taylor
Okay, cool. Yeah. Because I think, without building a full-blown video editor, which I think is kind of out of scope of what you want to do, it would be super cool if you find the key moment. If it's just text, that's fine, but if you can grab the video for that key moment and then add, say, 20 seconds before and after just in case it doesn't exactly start there, then you have a working clip that you can edit. You don't even have to do that editing; you can just get the clip right away.
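The "add some seconds before and after" idea Nick describes is simple arithmetic; a hedged sketch with hypothetical names:

```typescript
// Hypothetical helper: pad a key moment's start/end by a buffer so the
// clip doesn't cut in mid-sentence, clamped to the video's bounds.
function padClipRange(
  start: number,
  end: number,
  bufferSeconds: number,
  videoDuration: number
): { start: number; end: number } {
  return {
    start: Math.max(0, start - bufferSeconds),
    end: Math.min(videoDuration, end + bufferSeconds),
  };
}
```

The buffer could be a configurable option, defaulting to whatever margin tends to catch the start of a sentence.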
The part I'd still consider manual is, for example, we're on the live stream right now and you're on top of me in the video. Usually when I have a guest on, it depends. If you're sharing a screen, it'll be like this and it's easier to copy.
[00:32:46] But a lot of times I'll have the guest here and me here, and then I have to take two copies of the video and superimpose them so that I have, for like a 9-by-16 view, I can just show the guest and myself talking. So there's some stuff where, at least right now, that would be manual. But I think generating the clip itself... Sorry. Okay.
00:33:11 - Anthony Campolo
You need to rerun your setup command, actually, because...
00:33:14 - Nick Taylor
Something.
00:33:15 - Anthony Campolo
Something changed in terms of the project structure, I think.
00:33:18 - Nick Taylor
Okay. Yeah. Cool. We'll let that run. That shouldn't take too long. My internet's pretty fast too, so I think pulling will be fine. Yeah.
00:33:26 - Anthony Campolo
Sometimes it takes a really long time to download the video. Every now and then it does that because, for some reason, just the way yt-dlp is downloading the thing from YouTube, it'll be really slow. It's kind of hard to debug. I'm not really sure it's necessarily something that could be fixed.
00:33:44 - Nick Taylor
Will it try to download the audio again or will it check that it's there?
00:33:48 - Anthony Campolo
So what you can do is run the file command and feed it the WAV file you already downloaded. That's okay.
00:33:58 - Nick Taylor
Yeah, okay. Gotcha. I'll wait till this finishes.
But yeah, I think if it could generate the clips, the key moments as clips, and then I just take that and edit it, that already saves me a bunch of time. Obviously it'd be nice to have this all automated, but for the clips, because I really want it to look good, I also put the logo at the bottom and stuff like that.
00:34:31 - Anthony Campolo
Yeah.
00:34:32 - Nick Taylor
I'm sure you could automate it, but...
00:34:34 - Anthony Campolo
To fully automate, what you basically are going to be able to do is automate it to the point where you have a raw clip. If you just cut exactly what you have, then you could do all your other editing, but I'm not sure exactly on a clip-by-clip basis if you'd want to make some modifications first and then have a video that you clip. I don't know what would make more sense for your workflow.
00:34:56 - Nick Taylor
Yeah, but just getting that would be good. It's the kind of thing, like I said, you can tweak how much you want to add before or after. Anyways, yeah. So this is just installing CMake and ffmpeg.
00:35:17 - Anthony Campolo
I think I'll just run this also on my machine, just in case we get hung up on yours for network reasons or something. This is the self-hosted AI coding assistant video on the stream, so let me just run this as well.
00:35:35 - Nick Taylor
Doo doo doo doo doo. I don't know if it's hot where you're at. I can't remember. Are you in Minneapolis, or what city are you in exactly?
00:35:45 - Anthony Campolo
St. Louis.
00:35:46 - Nick Taylor
Saint Louis, okay.
00:35:48 - Anthony Campolo
It was very hot, like a week ago. It's still fairly hot, but not as hot, which is nice.
00:35:56 - Nick Taylor
Nice. All right. ffmpeg - that's hard to say. ffmpeg is taking its sweet time.
00:36:06 - Anthony Campolo
Yeah. No. Give me just one second.
00:36:10 - Anthony Campolo
I'll run this on my machine as well, because it was working for me. So how many key moments do you want? How long do you want them to be?
00:36:27 - Nick Taylor
You cut out. I lost you.
00:36:30 - Anthony Campolo
I'm sorry. I lost you for a second. What happened?
00:36:34 - Nick Taylor
It was my fault. This is one thing that's somewhat funny.
Because StreamYard and other services like that are in the browser, I refreshed the wrong page. There should be a "do you really want to leave this page?" prompt. I mean, it's been around forever in the browser; it's a built-in thing.
00:36:56 - Anthony Campolo
Yeah, it should be for sure. I've definitely done that before.
00:36:59 - Nick Taylor
Live streams for sure.
00:37:00 - Anthony Campolo
Yeah. Right before you dropped, I asked how many key moments you want and how long do you want them to be?
00:37:06 - Nick Taylor
Yeah, I'm probably gonna do 60s. I'll say three. Like, I used to just ask for one, but I thought it would be interesting to try a few just to see, you know.
Yeah. The one with Dougie, we were talking about self-hosting your AI coding assistant. So Continue.dev. It's actually where he works now.
00:37:35 - Anthony Campolo
Yeah. What is Continue.dev, do you know?
00:37:41 - Nick Taylor
Yeah, definitely AI stuff. But the TL;DR is, like, you install it as an extension in VS Code and you can prompt and stuff.
It's similar to Copilot, but it's like something you can just add to VS Code. I'm assuming you could probably add it to Cursor and Windsurf as well, just because they're based off of VS Code.
00:38:08 - Anthony Campolo
Your thing finally worked. It took forever.
00:38:11 - Nick Taylor
Oh, yeah.
00:38:13 - Anthony Campolo
No, I guess it's a whole bunch of stuff. But I think sometimes when you're also streaming, your computer just kind of rate limits other parts of itself.
00:38:22 - Nick Taylor
Yeah. Well, I have like an M4 Pro. That's what they gave me at work. So it's definitely got some power.
00:38:32 - Anthony Campolo
It's not a processing issue. I think it's a network issue usually.
00:38:36 - Nick Taylor
Yeah. No. But if I look, I mean, unless it's throttling it internally, I have a pretty fast connection. I'll just double check.
But yeah, I usually have like one gigabit up and down. So it's pretty close.
00:38:56 - Nick Taylor
I don't know what internet costs are where you're at, but it's reasonable. Okay. Yeah. So the internet's totally fine. It's done.
All right, so let's go back up. And you said use a file this time, right?
00:39:12 - Anthony Campolo
Yeah. You have to give it the path, basically. So just do content/, and then you're going to want to grab it. Actually, you could just do the copy relative.
00:39:26 - Nick Taylor
Yeah. Relative path.
00:39:28 - Anthony Campolo
Yeah, I always forget which ones, but I recognize it when I see it.
00:39:34 - Nick Taylor
Is it the WAV file? Right.
00:39:37 - Anthony Campolo
Yeah. That's the one.
00:39:38 - Nick Taylor
Okay, cool. Copy relative path. Does it have to be in quotes?
00:39:44 - Anthony Campolo
Yes. Or I'm not actually sure.
00:39:48 - Nick Taylor
Okay. Well, I'll just do it in quotes for now. We've already got the WAV file, so we're going to do that. We're going to prompt key moments. Let's go.
00:40:06 - Anthony Campolo
Got it on my machine, so I think I might just switch over. Okay.
00:40:09 - Nick Taylor
It said okay. Tried to copy it, I guess.
00:40:13 - Anthony Campolo
No such file.
00:40:14 - Nick Taylor
Or directory.
00:40:17 - Nick Taylor
It created a copy of it. Okay.
00:40:19 - Anthony Campolo
Interesting.
00:40:22 - Nick Taylor
Try.
00:40:24 - Anthony Campolo
Yeah.
00:40:25 - Nick Taylor
Okay.
00:40:27 - Anthony Campolo
Yeah. There's something weird in the download audio. Sometimes it gets confused. Just delete all the WAV files you have and rerun the original command you ran.
00:40:39 - Nick Taylor
Okay, cool. I'm just going to delete everything. Boom. Okay. Trash, baby. Trash. All right. Cool. Test video. Cool.
00:40:49 - Anthony Campolo
Yeah. Let's just switch screens and we'll look at mine. I'm just going to throw this guy into Claude. So here is the prompt.
So this is the base that you get every time: this is a transcript with timestamps. Okay. And then it says to identify the most compelling segments. So it was three.
00:41:19 - Nick Taylor
But actually, that's going pretty fast. It's already on step three. It's at 10% of the transcription.
00:41:24 - Anthony Campolo
It worked this time. Okay.
00:41:28 - Nick Taylor
I don't know. Maybe something else was clogging it up.
00:41:32 - Anthony Campolo
There's ghosts. Always ghosts in the machine. I find when you're working with something like that, you're always interfacing with YouTube through a tool that I'm sure YouTube doesn't want to exist, you know?
00:41:46 - Nick Taylor
Yeah.
00:41:49 - Anthony Campolo
Okay. So it says identify most compelling segments. Each segment should be approximately 60s long. And look for particularly insightful explanations.
And then this is the important part, because this is what it gives you back. And this is the thing that having structured outputs will fix: it won't just dump a chunk of markdown, it'll be more specific. But even this way, the create clips command would still work, because it still has the same markdown structure.
So basically what it would be able to do is identify these exact timestamps and clip them. But like I said, it's the audio, not the video. You just need to make the modification so it can clip the original video right there, and maybe give it like a 15-second buffer or something if you don't want to clip it slightly too soon or slightly too late, or maybe a 30-second buffer or something like that. Yeah.
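Clipping the original video at those timestamps could look roughly like the following ffmpeg invocation, sketched here as an argument builder (hypothetical; `-c copy` avoids re-encoding, which is fast, but cuts land on keyframes, so edges can be slightly imprecise):

```typescript
// Hypothetical sketch: build an ffmpeg argument list that cuts a clip
// out of the original video without re-encoding.
function buildClipArgs(
  input: string,
  output: string,
  startSeconds: number,
  endSeconds: number
): string[] {
  return [
    "-i", input,
    "-ss", String(startSeconds), // clip start
    "-to", String(endSeconds),   // clip end
    "-c", "copy",                // stream copy, no re-encode
    output,
  ];
}
// Would be run as something like: spawn("ffmpeg", buildClipArgs(...))
```

If frame-accurate edges mattered more than speed, re-encoding (dropping `-c copy`) would be the usual trade-off.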
00:42:38 - Nick Taylor
Yeah, exactly. And obviously those could be optional and you could set them. But yeah, that already would just save me a ton of work.
I mean, I know there's video editing tools like I think what I, okay, it's all done, actually. It just finished.
00:42:59 - Anthony Campolo
Cool. You should dump yours in ChatGPT. Since I'm doing Claude, we can kind of compare the differences.
00:43:04 - Nick Taylor
Okay. So prompt. All right.
00:43:11 - Anthony Campolo
Brian uses a brilliant analogy to explain why customizable AI assistants matter, comparing it to how navigation apps create traffic patterns, highlighting the risk of everyone coding identically with standardized AI tools. All right. Because if all the traffic apps send you the same way, it creates bottlenecks.
00:43:31 - Nick Taylor
Okay.
00:43:32 - Anthony Campolo
Is that what you're talking about? I don't know.
00:43:36 - Nick Taylor
Okay. So this has it. This is it.
00:43:44 - Anthony Campolo
[unclear]. It will look better. It will let me do this.
00:43:50 - Nick Taylor
Yeah. I'm laughing because Dougie ended the live stream with "See you in the next one. Stay saucy."
00:44:01 - Anthony Campolo
That's awesome. Okay.
00:44:04 - Nick Taylor
All right.
00:44:05 - Anthony Campolo
Yeah. So it gives you the exact transcript. So this would be.
00:44:11 - Nick Taylor
I've got ChatGPT thinking; it's refining transcript segments. It's doing some work.
00:44:20 - Anthony Campolo
Yeah, it looks like it cleans up. It basically gives it exactly as is.
00:44:26 - Nick Taylor
All right. So, thoughts for key moments.
00:44:31 - Anthony Campolo
Go back to you.
00:44:33 - Nick Taylor
Okay. So yeah. Why it matters in just a minute: Brian walks through how Continue's block system lets you choose different local model sizes and layer on custom rules drawn from real documentation, showing exactly how developers can fine-tune an AI assistant to suit workload and hardware limits. Okay, that's pretty neat.
Okay, so that's the one minute there then. Okay. How many does it generate by default? Is it three? Yeah. So it did three key moments. Yeah. And so essentially at this point the video gets downloaded, right? As well.
00:45:12 - Anthony Campolo
So what it does is it downloads the video and then strips the audio out into a WAV file, because the WAV file has to be set up a specific kind of way for it to work with Whisper. In some of the refactors I've been working on, it saves all of the files, so it will save the video file and the audio file. And that's eventually how it's going to end up, because then you could do things like edit the video directly, which is something we were talking about wanting to do.
So that's something. There's a couple things that I'd like to push upstream still, but yeah, that'll be a thing. Because right now it does download the whole audio-video file; it just deletes it after it creates the WAV file.
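The WAV requirement mentioned above has a concrete shape: Whisper implementations such as whisper.cpp expect 16 kHz, mono, 16-bit PCM WAV. A sketch of the ffmpeg arguments that extraction step would use (illustrative, not AutoShow's exact code):

```typescript
// Hypothetical sketch: ffmpeg arguments to strip audio from a video
// into the WAV format whisper.cpp expects (16 kHz, mono, 16-bit PCM).
function buildWavArgs(videoPath: string, wavPath: string): string[] {
  return [
    "-i", videoPath,
    "-ar", "16000",      // 16 kHz sample rate
    "-ac", "1",          // mono
    "-c:a", "pcm_s16le", // 16-bit signed PCM
    wavPath,
  ];
}
```

Keeping the original video file around after this step, as described, is what makes direct video clipping possible later.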
00:45:53 - Nick Taylor
Yeah, but what it generated looks pretty decent. So I'd say it's probably good to merge.
00:45:59 - Anthony Campolo
This is just with 4o.
00:46:02 - Nick Taylor
Yeah.
00:46:05 - Anthony Campolo
Yeah, I would try it with O3.
00:46:09 - Nick Taylor
Okay.
00:46:12 - Anthony Campolo
It probably won't look that different, but it will still be useful.
00:46:17 - Nick Taylor
Just get it to do some deep thinking. I would just open a new one. I'm going to be lazy and send it again.
00:46:27 - Generated/demo audio
You could use O3.
00:46:29 - Anthony Campolo
Oh, actually there's a button on the output where you can rerun it with a different model.
00:46:34 - Nick Taylor
Oh, okay.
00:46:35 - Anthony Campolo
Yeah. It's like the recycle thing. Not quite.
00:46:43 - Nick Taylor
That could be another interesting thing, where you pull out the key moments or any of the features you have in the CLI and run it against a few models and just, like, "Give me the best," you know.
00:46:57 - Anthony Campolo
Yeah, that's the thing I had an issue open about a long time ago: to be able to give it multiple LLMs to run and compare against each other. It's a pretty simple thing to do, but I don't think I ever actually implemented it.
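The multi-model comparison idea could be as simple as fanning the same prompt out and collecting the results side by side; a sketch with a hypothetical `run` callback standing in for whatever LLM client the CLI uses:

```typescript
// Hypothetical sketch: run one prompt against several models in parallel
// and return the outputs keyed by model name, for easy comparison.
type RunPrompt = (model: string, prompt: string) => Promise<string>;

async function compareModels(
  models: string[],
  prompt: string,
  run: RunPrompt
): Promise<Record<string, string>> {
  const outputs = await Promise.all(models.map((m) => run(m, prompt)));
  return Object.fromEntries(models.map((m, i) => [m, outputs[i]]));
}
```

A "give me the best" step could then be another LLM call judging the collected outputs, or just a human glance at the diff.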
00:47:18 - Nick Taylor
Okay, so it picked, I mean, I guess it's normal. It picked different key moments. Different model.
00:47:24 - Anthony Campolo
Similar. They're almost overlapping in terms of the timestamps.
00:47:30 - Nick Taylor
Okay.
00:47:31 - Anthony Campolo
That's basically the same thing it grabbed. This has the same starting time. Slightly longer end time.
00:47:38 - Nick Taylor
Yeah. Okay, so.
00:47:45 - Anthony Campolo
The transcripts are slightly different though. So one of them is rewriting something.
00:47:50 - Nick Taylor
Yeah. I don't know what happened there, but here's why it matters. The exchange zeroes in on how Continue differs from Cursor and VS Code agent mode, highlighting its open source nature, air-gap deployments, and why it matters for onboarding and governance. That could be interesting. But anyways, the TL;DR is the feature works.
00:48:15 - Anthony Campolo
Yeah, it's giving you the thing.
00:48:17 - Nick Taylor
Yeah, yeah. This is already super helpful. So I can do a quick review. Did Copilot review it for you yet?
00:48:27 - Anthony Campolo
No. Yeah. On that sucker, I asked Copilot to fix my sins.
00:48:33 - Nick Taylor
I find this is actually a good use of Copilot because there's some projects where it's just me. So although it can't approve the PR, you know.
00:48:48 - Anthony Campolo
Little.
00:48:48 - Nick Taylor
Buddy.
00:48:49 - Anthony Campolo
Little buddy, just in case. But it's maybe a very smart buddy with access to all of human knowledge.
00:48:56 - Nick Taylor
Yeah, it's generally pretty decent from what I've seen so far. Some stuff was nitpicks, but there was more to it because I basically made the demo MCP client just for the AI World Fair. I was vibe coding it.
I mean, it's still reviewing it. It's a React app. It's a TanStack Start app. And I've done a lot of React, so I was able to review stuff. But there's some stuff because I was moving super fast, like Copilot's like, "Oh, you forgot this edge case."
Okay. Yeah. So that's just doing the regular. It just gives a summary. Yeah. Okay. So you can see here.
00:49:44 - Nick Taylor
Yeah. So it's just saying use a constant. It's little things like this, so we can commit the suggestion.
00:49:53 - Anthony Campolo
Hopefully it doesn't break anything.
00:49:55 - Nick Taylor
Yeah, yeah. And then let's see if there's anything else. I think that's all I really have.
00:50:03 - Anthony Campolo
Like, I don't really have tests right now that run on the PR, so that's one thing that still needs work. It's hard to test the CLI. To fully test, you have to actually download a video and run through the whole process.
But I have test files. They just don't run automatically on PRs.
00:50:21 - Nick Taylor
You could. I wonder, yeah, it'd have to be like a CLI test.
00:50:27 - Anthony Campolo
So I'm sure I could build an agent or an action that would do it. It probably wouldn't even take that long.
00:50:39 - Nick Taylor
I'm wondering, where did it actually put that constant?
00:50:44 - Anthony Campolo
Probably in the select prompts.
00:50:47 - Nick Taylor
Okay. Select prompt. Okay. Yeah. Yep. Right there.
00:50:56 - Anthony Campolo
Line 65.
00:50:58 - Nick Taylor
Yeah. So does that actually?
00:51:01 - Anthony Campolo
It only did it for the key moments count, not for the moment duration. So okay, it's going to break on those two numbers.
00:51:08 - Nick Taylor
Yeah, it's gonna break because it said use it, but it didn't actually create the constant. So let me do a bit of cleanup real quick here.
00:51:19 - Anthony Campolo
Yeah. This is one of the reasons why I do this. Anytime I'm doing this kind of thing, I need to be able to just run the commands over and over and over again. Every single time I make a change, I just run a quick command to check it, make sure it works, and it takes like 10 to 15 seconds if I'm doing it on a video that's only a minute long.
So that's just. Yeah. And I have the test files, which will run a whole bunch of commands kind of in sequence if you want to do like a thorough check. But usually I can just run one. If I'm building a feature, it'll be something specific, like this one command needs to work. So I just test that one command.
And it does create this issue when you're trying to change stuff at the layer that Copilot is doing right now because it has no way of running these commands and actually knowing what the hell it just did and whether it works or not.
00:52:09 - Nick Taylor
See? And now it's doing it. It's putting the comment below it. But the other thing, we can clean up the other stuff. So let's see if it knows. Agent mode in VS Code has gotten a lot better in Copilot, actually, so let's see if it defaults.
Come on. Oh, wait. Am I not in? Oh, I'm not in agent mode. Well, it should still ask me over there, but let me switch this to agent mode. But this should, like I would have expected, work. It goes, oh, you know, "key moments." I'm waiting for it to kick in. Okay. It's not gonna guess it. Oh, there. Now it did it.
Okay, so now let's go up here, and it should. There we go. And then this. Yeah. Okay. In this, we don't. That's good enough. Okay. All right, so there.
[00:53:24] Are there any other things? I mean, this is obviously like minor stuff.
00:53:28 - Anthony Campolo
Configurable with the command. So yeah. Shouldn't mess with anything else within the scope of this PR.
00:53:48 - Nick Taylor
Cool. Let me just double check. I didn't break anything else before. We should go down here.
00:53:57 - Anthony Campolo
And just run one of the quick commands to test. You shouldn't run it on the same thing you did because that was a long video, but you should run it on just the RSS example I have.
00:54:08 - Nick Taylor
Okay.
00:54:09 - Anthony Campolo
So, Fuzzy.
00:54:11 - Anthony Campolo
You haven't missed what we're about to talk about, Fuzzy, which is the new CLI. Right now we're just adding a prompt, which is a little more minutiae anyway.
00:54:21 - Nick Taylor
Yeah.
00:54:22 - Anthony Campolo
Go down a bit right there. Yeah, yeah. You should test the last one, not the first one.
00:54:30 - Nick Taylor
Okay.
00:54:30 - Anthony Campolo
The last one actually does everything. But don't copy it after Claude. Go before that.
00:54:36 - Nick Taylor
Yeah, yeah.
00:54:37 - Anthony Campolo
Like that. You got it.
00:54:38 - Nick Taylor
Yeah, buddy. All right. All right.
00:54:43 - Anthony Campolo
Yeah. You see, right there in the options. There are two that are 60s because you're specifically configuring one. One is the default. One is not. So it's a good one.
00:54:52 - Nick Taylor
Oh yeah.
00:54:52 - Anthony Campolo
Yeah.
00:54:53 - Nick Taylor
Okay. Well, we can see that it's not going to break now, but let's just do it without the other default. Oh, actually. Yeah. There. Let's get rid of both.
00:55:07 - Nick Taylor
Okay.
00:55:13 - Nick Taylor
All right. Did it. Okay. So it didn't pass any in, and we should see it.
00:55:23 - Anthony Campolo
And also we didn't have this last time. You see there's progress, not a bar, but the percentage as it's running transcription, which is nice. We didn't have that before.
00:55:32 - Nick Taylor
Okay. So here.
00:55:35 - Anthony Campolo
Yep. Okay, cool.
00:55:36 - Nick Taylor
Okay. Looks like it's good, I think.
00:55:39 - Anthony Campolo
Yeah. Three by default, 60s long. Yeah. Cool.
00:55:43 - Nick Taylor
Cool. All right, let's push. Yeah, right. Clear. Cool. And refresh that bad boy. All right, I'm gonna give it a ship-it.
00:56:02 - Generated/demo audio
Ship.
00:56:06 - Nick Taylor
Cool. Nice. All right. Squash merge.
00:56:13 - Anthony Campolo
Man, I wish it had an API. You could just generate, like, a rap song for every PR. It would be so sweet.
00:56:24 - Nick Taylor
Nice. Cool. All right. So that's good.
00:56:27 - Anthony Campolo
Good. Fuzzy. We've been very good. We're building in a specific new prompt in the AutoShow CLI for Nick's use case for work, which is tight.
00:56:40 - Nick Taylor
Yeah.
00:56:42 - Anthony Campolo
Getting some good dog food.
00:56:45 - Nick Taylor
Yeah, like I said, I think you were surprised I started using it for work.
00:56:51 - Anthony Campolo
Well, I was hoping people would use it. I figured people would use it more for the personal stuff, though. Actually, no, that was just me using it and showing it to people who would use it for more personal stuff. But obviously the whole time I want anyone to use it.
00:57:05 - Nick Taylor
Yeah.
00:57:06 - Anthony Campolo
There are people I've talked to who aren't even developers at all who are using it for different purposes, like therapists who do three-hour-long workshops with very long chunks of content similar to a live stream, almost like this, big and raw. Yeah, exactly. Just video stuff.
But yeah, I was just happy that you were actually using it for a work thing, because I figured you would get some use out of it. You're one of the people I know who has a workflow similar to mine, creates a similar type of content, and faces similar types of problems and use cases. So yeah, it's great.
00:57:44 - Nick Taylor
Yeah. Cool, cool, cool. So I guess, yeah. So that's merged.
00:57:50 - Anthony Campolo
Okay. I will test that later off stream. Since you've only got 20 minutes left, I'm gonna go into the new, new. I'm gonna close that.
00:58:00 - Nick Taylor
I'm moving StreamYard onto my main screen since I'm not sharing. So I'm not turning my neck for the next 20 minutes, for sure.
00:58:08 - Anthony Campolo
Cool. Okay, so we got it. Now, this isn't actually in the CLI repo yet because I basically wanted to be able to work on stuff kind of separate from the repo. The repo is public, and sometimes I just want to create all sorts of weird stuff that could break things, or test content that I don't want to be hard-coded and stuff.
But I have basically created a whole bunch of extra functionality. The big stuff that I'm planning to merge upstream is going to be the image and text-to-speech stuff. So the image stuff, I'll show that first, because this is actually fairly similar to what I did a stream about a long time ago, and this is kind of downstream of that. Then I did a stream where we tested three different image APIs. One was DALL-E, because this was before the new ChatGPT image model, and another was Black Forest Labs.
[00:59:07] Yeah, Black Forest Labs is right there. And we also had a different one at the time. I even forget what it was. I have to go back to the stream. But at this point, it's now using AWS Nova. And that's also partly because we're using AWS Polly for text-to-speech.
So I kind of had a reason to build on AWS stuff, because it has all these different things. It's similar with Google and Azure; they have services for all these things as well, but I've just always had a lot of trouble setting both of those up. Whereas AWS, as complicated as it is, is still simpler for me, because I kind of figured out the CLI and some of their JavaScript SDK. It's not that complicated, you know?
So let me just run some of these. I'm gonna run the compare one because that runs all three of them. So this will use a prompt and run it from all three services.
[00:59:59] So it's going to be a sunset, a beautiful sunset over mountains. So this is going to run all three of them. So if you ever mess around with any of the AWS AI stuff...
01:00:14 - Nick Taylor
With which ones, AI?
01:00:18 - Anthony Campolo
Any of them. Bedrock is kind of the one you would use for LLMs. And then there's Nova. Nova, let me get this right, is the multimodal one, and Nova Canvas is what we're using as the image model. And then Polly is not part of Nova, which is kind of confusing. Polly is their text-to-speech service, which is kind of separate. But Nova also includes an LLM that does text and a whole bunch of other stuff.
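For reference, a Polly request through the AWS SDK v3 (`@aws-sdk/client-polly`, `SynthesizeSpeechCommand`) takes a parameters object like the one built here; the voice and engine choices are just illustrative defaults:

```typescript
// Sketch of a Polly SynthesizeSpeech parameters object. The voice and
// engine are illustrative defaults; Polly offers many voices and both
// "standard" and "neural" engines.
interface PollyParams {
  Text: string;
  OutputFormat: "mp3" | "ogg_vorbis" | "pcm";
  VoiceId: string;
  Engine?: "standard" | "neural";
}

function buildPollyParams(text: string, voiceId = "Joanna"): PollyParams {
  return { Text: text, OutputFormat: "mp3", VoiceId: voiceId, Engine: "neural" };
}
// Usage (not run here):
//   new PollyClient({}).send(new SynthesizeSpeechCommand(buildPollyParams("Hello")))
```

The response's audio stream would then be written out as the generated speech file.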
01:00:53 - Nick Taylor
Oh, okay. Okay.
01:00:54 - Anthony Campolo
I think it might even work with video. Actually, I wanted to pull these up just in the browser. Yeah.
01:01:00 - Nick Taylor
I haven't actually used any of the AWS stuff.
01:01:04 - Anthony Campolo
They're interesting, I gotta say. So yeah, that's also the thing. It has Nova Reel, which is what a lot of things don't have: a whole video generation thing built in. So part of the reason why I was interested in Nova in particular is because it gives you almost all the functionality you need to do the show I was talking about last time we were on. I was telling you, my friend wrote this sci-fi show, a cartoon, based on his family's workplace.
So I've been needing to figure out, okay, how do I go from a script to a show? And that's actually a crazy problem to solve, because you have to go from text to images to voices that then sync in a video output. And almost no tool includes all of that functionality in itself. So you have to reach out to different services, like ElevenLabs for the voices, and Runway or Kling for the video.
[01:02:03] This is kind of the first thing I was trying, and now I'm thinking like, well, if you're just using Nova, actually, you could do every single step. It might not be the best for each individual step, but it would be able to do the whole thing without needing to stitch together a whole bunch of different stuff.
So the video stuff is not actually in the CLI. This is all just me kind of envisioning where this could end up one day when I have the time to do it.
01:02:26 - Nick Taylor
But yeah, totally.
01:02:28 - Anthony Campolo
Nova's fascinating. And then there's Polly, which is something they had before Nova. So that's why it's kind of different, but that's the docs.
01:02:53 - Generated/demo audio
Do.
01:02:55 - Anthony Campolo
Yeah. So Polly is a cloud service that converts text into lifelike speech. So have you used ElevenLabs or have you used any of these text-to-speech things yet?
01:03:04 - Nick Taylor
I gotta give ElevenLabs a try. Thor and his crew were actually right beside us at the AI World Fair booth, so I talked to him for a bit. But I definitely want to give it a go.
01:03:16 - Anthony Campolo
Okay, I'm getting slightly ahead of myself because we still need to look at the image outputs. So let's do that first, and then we'll get more into the text-to-speech.
So this is the Black Forest Labs one. This is DALL-E, and this is not the newest ChatGPT model; it's the old ChatGPT image model. That's why it says DALL-E. And this is Nova.
So there's still a couple of things that I have to add in, which is the ability to select different models, because there is only one Nova model, but there are different Black Forest Labs models, and there are different ChatGPT models, and the new ChatGPT models are like the best. So that's one thing I have to still do before I can push this up fully.
But let me now go to the text-to-speech. With text-to-speech there's a wild array of choices. I tried so many different things, like ten different ones, maybe five open source and three or four paid ones.
[01:04:17] Open source stuff is really hard to get working. Right now I'm using Coqui, which is the one I found works the best. But if you go look at this, they have this.
01:04:34 - Nick Taylor
Type of frog or something.
01:04:37 - Anthony Campolo
Who knows? I asked ChatGPT. But it says here they're shutting down. So this is an open source project that hasn't been committed to in, like, a year. It still works, but that's kind of a bummer.
There's Kokoro, there's another one. This one was the only one that actually had a working JavaScript library out of all the ones I tested. So this is probably the next one I'm going to try to fully integrate.
But for this to work, I'm going to need to, I think, actually switch something, but let me just run this first.
So this is using just the open source thing. And part of the reason why it's also challenging is because, like LLMs, you'll have all these different models that have been trained on different stuff, that are different sizes and have different tradeoffs. So with the open source things, you're only going to get something that's not so good.
[01:05:38] You need to download massive models, which you can do, but it takes a lot of time to download, and it takes forever to run. So let me switch this to a tab. I think that I can share audio if I share a tab instead. Crap, a tab won't give me VS Code.
01:05:58 - Nick Taylor
Son of a bitch.
01:05:59 - Anthony Campolo
VS Code. So I don't know how to share this audio, actually.
01:06:03 - Nick Taylor
Oh, you probably can. I know Zoom does it, but maybe when you share your screen, I think you can specify that you want to do audio too. Wait, let me look here. Some screens let you share audio. Look for the Share Audio checkbox on the next window.
So okay. Yeah. If you share your screen, there's a toggle on the bottom when you go to share, and it just says Share Audio. So stop sharing for a sec.
01:06:37 - Nick Taylor
And then go click on present again.
01:06:41 - Nick Taylor
And then you're going to say share screen. Then the screen sharing tips pop up. Click Share screen, and then you'll see Share tab audio. The toggle should be checked by default.
01:06:57 - Anthony Campolo
That's only for tabs. And what I have is not a tab. It's in VS Code, which is not running in the browser.
01:07:06 - Nick Taylor
Oh yeah.
01:07:07 - Anthony Campolo
Yeah, it's happening right now. I know a quick way to fix this. I can just dump it into my R2 bucket, and it'll take me two seconds to do that.
There we go. I don't want to show this on screen, so I'm just going to do this off screen real quick. If you mess around with R2 at all, Cloudflare R2 is like an S3 thing.
01:07:30 - Nick Taylor
Yeah, I know the API is compatible. I wonder if, because I know Bun has like an S3 API you can use directly in Bun, if you can just wire up R2. I imagine you could, but that's...
01:07:43 - Anthony Campolo
A good question. Potentially they have an S3-compatible API. So let me do this and that, this and that. And this should just take a quick second, I think.
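Since R2 does speak the S3 API, pointing Bun's built-in S3 client at it mostly comes down to using the account-scoped endpoint. A rough sketch follows; the bucket name and credential variable names are placeholders, and the commented-out Bun wiring is an unverified assumption about Bun's S3 support, not tested here.

```typescript
// Cloudflare R2 exposes an S3-compatible endpoint scoped to your account ID.
function r2Endpoint(accountId: string): string {
  return `https://${accountId}.r2.cloudflarestorage.com`;
}

// With Bun, wiring R2 up would look roughly like this (unverified sketch):
//
//   import { S3Client } from "bun";
//   const client = new S3Client({
//     endpoint: r2Endpoint("your-account-id"),       // placeholder account ID
//     bucket: "autoshow-audio",                      // hypothetical bucket name
//     accessKeyId: process.env.R2_ACCESS_KEY_ID!,
//     secretAccessKey: process.env.R2_SECRET_ACCESS_KEY!,
//   });
//   await client.write("test.wav", Bun.file("./test.wav"));
```

The key point is that nothing R2-specific is needed beyond the endpoint and credentials; the rest is ordinary S3 client usage.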
01:08:01 - Anthony Campolo split
Come on now. There we go. Okay. Okay, let me go back to sharing my screen.
01:08:17 - Nick Taylor
I'm laughing because on LinkedIn, you said we're live, and the thumbnail it took of you, you're like this. You're like, let's go.
01:08:28 - Anthony Campolo
Awesome.
01:08:29 - Anthony Campolo split
Good stuff.
01:08:30 - Anthony Campolo
Okay. Can you see the right thing?
01:08:33 - Nick Taylor
Yeah.
01:08:34 - Anthony Campolo split
Okay.
01:08:35 - Anthony Campolo
Let's see how this tests.
01:08:37 - Generated test voice
This is our test.
01:08:38 - Nick Taylor
Oh, yeah. I hope it came out pretty clear.
01:08:40 - Generated test voice
It's a very good test. Shaka laka doo dah a. Whoa, whoa, bang.
01:08:48 - Anthony Campolo
I say!
01:08:51 - Anthony Campolo split
The test file.
01:08:53 - Anthony Campolo
It says I'm testing. This is a test. I hope that this test is a very good test. Shaka laka doo dah. Oh, wow. Wow. Bang!
01:09:00 - Nick Taylor
All right.
01:09:01 - Anthony Campolo
I had it say something kind of nonsensical, so it comes out sounding different no matter which one I use.
Okay, so you're still seeing that? Let me real quick.
01:09:13 - Anthony Campolo split
Quick.
01:09:14 - Anthony Campolo
I'm just going to run the same thing with ElevenLabs and do the exact same thing.
01:09:20 - Nick Taylor
I should see if I can get Thor on the live stream. I've been doing a lot of AI stuff on my work live stream.
01:09:29 - Anthony Campolo split
Mhm.
01:09:34 - Anthony Campolo
What kind of stuff are you working on right now?
01:09:37 - Nick Taylor
Right now I'm giving the talk I gave at the MCP Dev Summit, the one about improving MCP security with zero trust. That's the talk I'm probably going to be giving in a bunch of places this year.
01:09:57 - Anthony Campolo
You've already given that a couple of places, right?
01:10:01 - Nick Taylor
I gave it at the MCP Dev Summit, and since then I haven't yet, but I'll be giving that talk at Black Hat. It's going to keep getting better as the year goes by, like most talks do when you give them more than once.
But yeah, I'll be giving it at Black Hat. I'm giving it at the end of July in SF at a private event. I don't have all the details yet. Then I'm also giving it at Commit Your Code, the conference Danny started last year in Texas.
Cool. I'm going to see if I can get it into a few other places. But yeah.
01:10:44 - Anthony Campolo
Okay. So now this is ElevenLabs. You'll notice it's only five seconds; the other ones were like nine seconds, so it's talking a lot faster. That's one of the things you can configure: the speaking speed.
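That speed difference is easy to put a number on: the same text rendered in five seconds instead of nine means roughly 1.8 times the speaking rate. A trivial sketch of the comparison:

```typescript
// Relative speaking rate of two renditions of the same text:
// the shorter clip is (longer duration / shorter duration) times faster.
function relativeRate(longerSec: number, shorterSec: number): number {
  return longerSec / shorterSec;
}

// 9 seconds vs 5 seconds for identical text → 1.8x faster.
const speedup = relativeRate(9, 5);
```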
01:10:55 - Generated test voice
This is a test. I hope that this test is a very good test. Shakalaka Duda a wow wow bang.
01:11:00 - Anthony Campolo
So it nailed the "Shakalaka Duda, oh wow wow, bang." The other one was kind of confused by it. So you can see how the paid models are noticeably better.
01:11:12 - Nick Taylor
Yeah, exactly. That's cool. So is that what you're working on right now, the next kind of stuff in AutoShow, or...?
01:11:24 - Anthony Campolo
Yeah. So this stuff basically works. Right now all I have left is just to bring it upstream into the main repo, which is something I front-loaded all the work for over the last week or two while I was moving. I just need to sit down and make the last couple changes.
So sometime this week, all the stuff is going to be upstreamed and then anyone can use it. It'll be public. And then after that, I'm planning on putting the last couple touches on the AutoShow app and launching that sometime before the middle of July, because I'm going on JavaScript Jabber. So I want to talk about AutoShow and be like, hey, this thing that I just launched. Okay, go use it. It has to actually be available and usable if I want that to be the case.
But just a couple tasks to do for that, because I got all the login and payment stuff working.
[01:12:14] So now I kind of have to do a bit of stuff to the styling to make it look more presentable, but yeah, so that's exciting.
01:12:22 - Nick Taylor
That's cool. Yeah.
01:12:24 - Anthony Campolo
Do you have any upcoming things people should be looking for before we wrap it up here?
01:12:29 - Nick Taylor
Yeah, I'm doing a live stream on some more Astro stuff on the dev channel soon. I'll drop the link here in the chat. It's not on there yet. I have to put up the social stuff.
But I'm going to be hanging with Abby from DigitalOcean, and we're going to be talking about event-driven architecture. He's got a cool thing he wants to walk me through.
And yeah, I've got some other interesting guests in the pipeline.
Aside from that, I'm basically trying to live stream most weekday mornings Eastern time. I don't know about you.
01:13:12 - Anthony Campolo
It's just a little too early for me. I usually wake up like an hour or two after your stream. But I don't know, my wife wants me to wake up earlier, so maybe I will.
01:13:23 - Nick Taylor
Yeah. No. Well, the thing is, I'm glad you brought that up, because I used to be like, okay, I have to stream at this time because so-and-so's not streaming then.
And then I tried that for a while, and I kind of gave up on it. Now it's just: this is when it works best for me to stream. And it's not a terrible time, really, usually 9:30 to 11:30 a.m. Eastern.
Well, a couple things. One, it's not like I have tens of thousands of followers, but...
01:13:56 - Anthony Campolo
Yeah.
01:13:58 - Nick Taylor
It just works better for me because a lot of my team is on the West Coast as well, and what I'm doing is work-related anyway.
Getting it out of the way in the morning helps, because come noon my time I have meetings. We have our stand-up, and then any other meetings are usually in the afternoon Eastern time. So it seems to work well.
01:14:28 - Anthony Campolo
Fuzzy's saying he got [unclear] with a few projects. That's why he hasn't hit you up yet. He was also saying, and I don't think we read these out earlier, I think in relation to Nova, that it sounds similar to Bedrock.
And: the day they get away from stock images would be awesome. I mean, yeah, stock AI images are all unique, that's for sure, in a literal sense. But they all look the same. I love how there are snow-capped mountains. There are some mountains with snow, I believe.
01:14:57 - Nick Taylor
So somebody's running a Bob Ross-like prompt through things, maybe. Yeah. I'm just going to put a little cloud up here. A little.
01:15:04 - Anthony Campolo
Friendly cloud.
01:15:06 - Nick Taylor
Yeah.
01:15:06 - Anthony Campolo
He's talking about the browser text-to-speech API, which I imagine would not be super powerful if it has to run in the browser.
01:15:13 - Nick Taylor
Yeah, it's interesting times, but I'm supposed to hang with Fuzzy on my live stream. He's just got a lot going on, so.
I mean, there's no rush, Fuzzy, at all. It's like whenever it works for you. But it'd be cool to hang. But yeah, that's...
01:15:32 - Anthony Campolo
Pretty much for sure.
01:15:34 - Nick Taylor
Yeah. Cool. Well, listen, man, I know you've got a hard stop, and I'm actually going to play tennis in a bit, so it's always great.
01:15:41 - Anthony Campolo
I think we accomplished all the stuff we wanted to accomplish. I got to demo some of the upcoming functionality.
Yeah. Super fun to do this. We should do it again in a month or two, like usual.
01:15:51 - Nick Taylor
Yeah. Sounds good, man. All right. Take care, everybody.
01:15:55 - Anthony Campolo
Later, everybody. We'll catch you next time.