
Pipedream with Dylan Pierce
Episode Description
Dylan Pierce from Pipedream demos AI-powered web scraping with Puppeteer, showing how developers can automate workflows and extract data without writing boilerplate code.
Episode Summary
In this episode of JavaScript Jam, hosts Scott Steinlage and Anthony Campolo welcome Dylan Pierce from Pipedream, a developer-focused workflow automation platform. Dylan shares his journey from ten years of software engineering into developer relations, then dives into a live demo of Pipedream's capabilities. He starts by explaining the platform's basics — projects, workflows, triggers, and serverless configuration options like memory limits, concurrency, and provisioned warm containers to eliminate cold starts. The conversation moves into Pipedream's newly released Puppeteer and Playwright support, which lets developers run browser automation inside serverless containers now that AWS increased Lambda size limits. Dylan demonstrates taking screenshots of websites, uploading them to Imgur, and then showcases the platform's AI-powered code generation, which uses GPT-4 with vector-embedded Pipedream documentation to write Puppeteer code from natural language prompts. A live attempt to scrape Twitter fails due to platform-level blocking, but scraping Reddit comments succeeds, illustrating how AI can help developers quickly target specific data without memorizing query selectors. The episode wraps with discussion of upcoming features like branching and looping in workflows, Pipedream's competitive positioning as a developer-first tool compared to no-code alternatives, and broader thoughts on how AI and serverless abstraction are reshaping integration work.
Chapters
00:00:00 - Introductions and Dylan's Path to DevRel
Scott and Anthony introduce the show and welcome Dylan Pierce, who works in developer relations at Pipedream. Dylan describes his background as a software engineer for about ten years before transitioning into DevRel, explaining that his love for meetups, documentation, and developer experience made the role a natural fit.
Anthony and Dylan discuss the differences between coming into DevRel from an engineering background versus a non-traditional path. They touch on how tools like Pipedream reduce the glue code that trips up less experienced developers, and how AI tools like ChatGPT are changing the game — though knowing how to prompt effectively remains essential. Dylan emphasizes that Pipedream's AI features are optional enhancements to an already fast workflow-building process.
00:05:24 - Pipedream Basics: Projects, Workflows, and Triggers
Dylan begins his screen share and walks through Pipedream's core concepts. He explains that projects function like repositories and can be connected to GitHub for version control, while workflows are essentially scripts composed of sequential steps — comparable to Zapier automations but built for developers.
The discussion covers the various trigger types available, including webhooks and cron-based schedules. Anthony and Dylan break down what a webhook actually is in plain terms, demystifying a concept that often confuses newcomers. Dylan also highlights configuration options like container memory, timeout settings, concurrency controls, and the ability to eliminate cold starts with provisioned warm workers — features that are especially important for latency-sensitive integrations like Discord and Slack bots.
00:11:17 - Serverless Configuration, Pricing, and Use Cases
Dylan explains Pipedream's VPC support for connecting to firewalled databases using static IP addresses and walks through the platform's pricing model, noting that the free tier is generous and only limits the number of active workflows and connected accounts. Enterprise features like VPC are reserved for paid plans.
The conversation shifts to practical use cases, with Dylan describing how Pipedream excels at handling the integration side-work that businesses need — like piping Zendesk tickets to Slack or syncing CRM data — without consuming developer time on the core product. He shares a fun personal project called "Dictionary," a Pictionary-style game using DALL-E inside Slack, built entirely on Pipedream, which illustrates how the platform shines when you only need backend logic and can rely on Slack or Discord for the UI layer.
00:17:14 - Puppeteer and Playwright Support on Pipedream
Dylan introduces Pipedream's recently launched support for Puppeteer and Playwright, explaining that these browser automation tools let developers programmatically control a Chromium instance to take screenshots, generate PDFs, fill forms, and scrape JavaScript-heavy single-page applications that regular HTTP requests can't render. He notes that AWS's increase of Lambda size limits from two to eight gigabytes made bundling Chrome feasible.
He demonstrates a simple workflow that navigates to his personal website, captures a screenshot, and uploads it to Imgur — all without writing any code, using only pre-built actions. The hosts discuss how this capability opens up possibilities like generating PR preview screenshots for GitHub comments, uploading to S3 or Cloudinary for image manipulation, and creating PDF reports from HTML pages instead of wrestling with PDF generation libraries.
00:24:29 - AI-Powered Code Generation with Puppeteer
Dylan showcases Pipedream's AI code generation feature, which uses GPT-4 with prompt engineering and vector-embedded Pipedream documentation to generate Puppeteer code from natural language descriptions. He demonstrates asking the AI to grab the H1 element from a webpage, and the system produces working Puppeteer code without the user needing to consult any documentation.
The hosts discuss how the AI knows the context is Puppeteer because the documentation has been crawled and stored in a vector database, ensuring accurate and relevant code generation. Dylan shares how he used this approach to build a Reddit scraper after the API became prohibitively expensive, simply pasting HTML snippets into the prompt and letting the AI figure out the correct query selectors for extracting comments, upvotes, and usernames.
00:32:36 - Live Scraping Attempts: Twitter and Reddit
Dylan attempts a live, unrehearsed scraping experiment. He first tries to extract tweets from Anthony's public Twitter profile by inspecting the DOM for data-test-id attributes, but the attempt fails — Twitter appears to block requests from AWS IP addresses entirely, returning zero content to the headless browser.
Undeterred, the team pivots to Reddit, where the HTML structure on old.reddit.com proves far more scrape-friendly, with descriptive data attributes for comments, upvotes, and reply counts. The AI successfully generates working code that extracts comment text from the JavaScript subreddit. The contrast between Twitter's aggressive blocking and Reddit's accessible structure highlights how platform policies dramatically affect what's possible with browser automation tools.
00:42:28 - Natural Language Prompting and GPT-4 Vision Ideas
The hosts experiment with whether non-developer language can drive the AI code generation, finding that without technical specificity about HTML structure, the AI falls back to incorrect approaches like using Axios instead of Puppeteer. This leads to a discussion about how developer context remains essential for reliable results.
The conversation takes an imaginative turn as Scott describes using GPT-4's image processing capabilities — taking screenshots or PDFs of web pages and feeding them directly to the model for analysis. Dylan realizes this could be automated as a Pipedream workflow: Puppeteer captures a screenshot, uploads it to OpenAI's vision API, and the model interprets the page content. The idea sparks excitement about combining browser automation with multimodal AI in novel ways.
00:52:27 - Upcoming Features, Competition, and Closing Thoughts
Dylan discusses Pipedream's roadmap, highlighting branching and looping as the most requested features. Currently, developers must use conditional logic in Node.js or chain separate workflows to handle iteration over large datasets within Lambda's timeout limits. Native UI support for these patterns is in active development and will significantly simplify complex automation flows.
The conversation closes with Dylan positioning Pipedream against competitors, noting that while many workflow tools target no-code users and some developer tools offer serverless execution, few combine OAuth management, thousands of pre-built integrations, and a code-first experience the way Pipedream does after six years of refinement. He directs listeners to pipedream.com and the community Slack, and the hosts encourage developers to try the platform for their next integration task instead of spinning up yet another standalone Node script.
Transcript
00:00:02 - Scott Steinlage
Hey everybody, what's up? Welcome to JavaScript Jam. Today we have an awesome guest here, but first I'm Scott Steinlage. If it's your first time listening to this, I am one of the co-hosts of this podcast. And right over here I have Anthony Campolo.
00:00:21 - Anthony Campolo
And I'm a developer advocate at Edgio and also co-host.
00:00:26 - Scott Steinlage
Awesome. Dylan, how are you doing?
00:00:27 - Dylan Pierce
Great, thanks for asking.
00:00:30 - Scott Steinlage
Yeah, man. So why don't you tell people who you are?
00:00:34 - Dylan Pierce
Yeah. So I'm Dylan Pierce, but I go by Pierce at Pipedream because there's another Dylan, and I work in developer relations over at Pipedream. I am a software engineer turned developer relations person, so I can tell you about the code at Pipedream and how to use it.
00:00:52 - Anthony Campolo
So how long were you engineering before you started doing DevRel?
00:00:56 - Dylan Pierce
I mean, I started tinkering when I was a young kid, like 10 or 11, but professionally, I made my first SaaS product when I was 20.
00:01:10 - Anthony Campolo
So how old were you when you got into DevRel?
00:01:14 - Dylan Pierce
Just this past year. So ten years of software engineering and then one year of DevRel.
00:01:19 - Anthony Campolo
So what inspired the shift? I find this really interesting because I'm someone who does DevRel and I never did engineering beforehand. I went to a bootcamp, but I got into tech through doing DevRel roles. And I know for a fact there's a lot of things that people who do this don't know about how to actually do software engineering. It's like we do our best, give good information, and hopefully learn the tool enough to explain it. But there's a breadth of experience you get from 10 years of engineering that you can't get from your bootcamp. Like, you just can't.
00:01:51 - Dylan Pierce
That's true. But I would also say what I love about DevRel specifically is developers are really tool-based. Your choice of tools is based on documentation and how big the community is. And as a developer, you can make this choice based on where you think it's going and the health of it. I also went to meetups naturally because I love talking to other developers. And when I realized you could do both these things and get paid for it, it's like, what a great gig. The networking part of it and the opportunity to write content about really cool tools and help make it easier to get integrated, which is half the battle. I call it Internet plumbing. We're all Internet plumbers, in a way. We're just connecting services or even lower-level APIs together. But to do that effectively, you need solid documentation and an easy developer experience, which I'm very passionate about.
00:02:51 - Anthony Campolo
Yeah. And that really is what makes it a lot more accessible for people, because you can give people tools that let them build things with niche knowledge. We can make software less complicated. It's a thing that we can do. And with tools like Pipedream, you're connecting services that would require so much glue code. That's where less experienced developers can struggle because you only really get better at doing that through iteration and doing it a bunch of times. Then you learn the different idiosyncrasies. Now ChatGPT changes a lot of this because you can start with generated code, but even still you need the knowledge to know how to ask the right questions so you can get the correct code.
00:03:32 - Dylan Pierce
Exactly. Yes, I totally agree. Knowing how to prompt is half the battle.
00:03:41 - Anthony Campolo
Yeah, yeah. And so you're going to show something to do with AI today. This is kind of a cliché at this point. We bring someone on and ask, so what's your current AI thing that you're doing? We just had someone on to show their app generator, and before that we had OpenSauced on to show us their AI analyzer thing. So you're the third person to come on and be like, here's our AI thing. And I think it's great. I think everyone should find 10 ways to try and put AI in their apps.
00:04:10 - Scott Steinlage
Yes.
00:04:10 - Anthony Campolo
And then if nine of them fail, that's fine. You just found one that's going to change the game totally.
00:04:15 - Dylan Pierce
Yeah. And just to make it clear, this is not a forced path in the app. You could totally use it without AI. This is just a kind of quick start for an already really fast process.
00:04:29 - Anthony Campolo
Building based on understanding a site in order to scrape it is, I think, what you're describing. And that sounds really useful to me, because this is something that ChatGPT lacked for a while. They offered browser support for a certain amount of time and it worked really poorly, and then they took the feature away for a while. So this is something they've really struggled with, and I'll be curious to see how well this works: looking at a webpage and understanding it.
00:04:55 - Dylan Pierce
Oh, okay. So that is a bit deeper than what it can do, but I have found a way around it.
00:05:03 - Anthony Campolo
You said you can fill out forms, though.
00:05:04 - Dylan Pierce
They can fill out forms.
00:05:06 - Anthony Campolo
So that means, to a certain extent, it understands that there are inputs that take data on this website. It knows something about that website.
00:05:14 - Dylan Pierce
That's true. Yeah. Yeah. So you can describe the HTML content in the prompt and it does a really great job of filling.
00:05:24 - Anthony Campolo
Bump up your font a little bit.
00:05:26 - Dylan Pierce
Oh, sure. So you can see my screen. That's good. Is that good or should I show a little bit?
00:05:31 - Anthony Campolo
I think that's one more, I think.
00:05:33 - Dylan Pierce
Sure.
00:05:34 - Scott Steinlage
Do it.
00:05:34 - Dylan Pierce
Yeah.
00:05:34 - Anthony Campolo
One more.
00:05:35 - Dylan Pierce
Yeah.
00:05:35 - Anthony Campolo
So it's good to just be a little.
00:05:38 - Scott Steinlage
Perfect.
00:05:40 - Dylan Pierce
Good.
00:05:40 - Scott Steinlage
This looks great. Yeah.
00:05:41 - Dylan Pierce
Cool. So, just to take a step back for anyone who's never seen Pipedream before: in Pipedream, you create a project first. You can think of a project as equivalent to a repo, and that's because you can connect it to a repo. You can serialize all your workflows within one GitHub repo, which almost makes it a monorepo of sorts. Just for speed, I won't turn this on, but know that you can connect your GitHub account, and then you'll make commits and deploys and get version control, rollbacks, and so on.
00:06:15 - Anthony Campolo
Yeah.
00:06:17 - Dylan Pierce
So we'll call this one VM stack JavaScript. VM stack. Cool. Now we have the ability to make folders and workflows. And in this case, what is a workflow? That's a great question. A workflow is basically a script. If you're coming from a developer background, you can think of a workflow as basically a script. It's a series of steps. If you're coming from a no-code background, it's kind of like a Zapier automation or just an automation of sorts.
00:06:53 - Anthony Campolo
One of the reasons we got in touch in the first place is because we had referenced you in relation to Val Town, which we've been doing an episode on. And I was thinking, this reminds me of Pipedream a lot. And so if people check out that episode, you'll also see kind of a different take on this.
00:07:11 - Dylan Pierce
Exactly. And what's funny about that is an automation, like a workflow of ours pinged us when you mentioned us and that's how we joined.
00:07:19 - Anthony Campolo
Exactly. You use Pipedream to get notified about someone on the Internet. It's like the Bat-Signal.
00:07:26 - Dylan Pierce
Exactly. So let's start with the simplest use case, like grabbing a screenshot of a website. Let's say you make a PR for your Redwood site, and you want to automate this workflow to grab a screenshot of your change and then post it as a GitHub comment, just as an example. So we'll just call this screenshotter. Here you have more detailed controls for the underlying serverless container that runs this workflow. So with this tool you're not worrying about DevOps things at all. You're just setting container memory and timeout. You can even control concurrency and execution, which is pretty cool. And that way you can respect API rate limits without...
00:08:15 - Anthony Campolo
You make it super explicit about setting things like the timeout and the memory, and this is stuff that is so hard to get to when you use a lot of these services that wrap things on top of Lambda.
00:08:26 - Dylan Pierce
Yes. Yeah. So I am firmly in the camp too that Lambda is very unapproachable for the average developer, but I would argue this abstracts that away and makes it approachable and makes you more productive because you're not worrying about, like, wait, what's that param again I need to set up for my bash script to deploy this thing? You don't have to worry about that. It's all in the UI. You can even eliminate cold starts too.
00:08:58 - Anthony Campolo
How does it eliminate cold starts? So you have provisioned concurrency. Is that what they call it?
00:09:03 - Dylan Pierce
Exactly. So you can set up a number of workers, aka containers, that will stay warm. For those who don't know about AWS Lambdas or any serverless platform, you save on cost because it's on-demand, but as soon as the function isn't called within like five minutes, it'll go into a resting state because it's expensive for AWS to keep that code living anywhere in its us-east-1 region. So they have to basically take your code and put it back into cold storage and then warm it back up again.
00:09:39 - Anthony Campolo
Yeah, the funny thing about cold starts is that if you have an app where people can wait a couple of seconds, it's not really a big deal. But if you have an app that starts up on a Lambda, you're screwed, because you've just caused your entire app to load slowly every single time. This was actually a huge problem for Redwood, and it required us to move away from making it Lambda-only.
00:10:09 - Dylan Pierce
Oh yeah, yeah, yeah. So I had that same problem with Next.js and my solution was actually to create a workflow in Pipedream that just takes an array of URLs and pings them with a query param called warming. So I keep my critical routes warm, such as install, change a setting, and subscribe. Like, heaven forbid your customers can't subscribe because they're waiting five minutes for the worker to start up.
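The warming workflow Dylan describes can be sketched roughly as follows. This is a minimal illustration, not his actual workflow: the route URLs, the `warming=1` value, and the function names `toWarmingUrl` and `warmRoutes` are all assumptions made for the example. It relies on the global `fetch` available in Node.js 18 and later.

```javascript
// Hypothetical route list; the conversation only names the install,
// change-a-setting, and subscribe flows, not real URLs.
const CRITICAL_ROUTES = [
  "https://example.com/api/install",
  "https://example.com/api/settings",
  "https://example.com/api/subscribe",
];

// Append a `warming` query param so the route can recognize the ping
// and return early. The value "1" is an assumption.
function toWarmingUrl(route) {
  const url = new URL(route);
  url.searchParams.set("warming", "1");
  return url.toString();
}

// Ping every route in parallel. Promise.allSettled is used so that one
// failed ping does not prevent the others from being reported.
async function warmRoutes(routes, fetchImpl = fetch) {
  const results = await Promise.allSettled(
    routes.map((route) => fetchImpl(toWarmingUrl(route)))
  );
  return results.map((result, i) => ({
    route: routes[i],
    ok: result.status === "fulfilled" && Boolean(result.value.ok),
  }));
}
```

Run on a schedule shorter than the platform's idle window, something like `warmRoutes(CRITICAL_ROUTES)` would keep those routes from going cold. The `fetchImpl` parameter exists only so the function can be exercised without real network calls.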
00:10:36 - Anthony Campolo
Yeah.
00:10:37 - Dylan Pierce
And it's.
00:10:38 - Anthony Campolo
That was the original solution when it was first discovered and now today it is still the solution and will continue to be the solution for as long as the technology exists. Because it works the way it works and it's not going to change. That's kind of my opinion on it.
00:10:50 - Dylan Pierce
Yeah. Yeah. So there are definitely limitations to serverless. It's not the be-all and end-all.
00:10:57 - Anthony Campolo
But you let the dev click a button that fixes the limitation, which is nice.
00:11:02 - Dylan Pierce
Exactly, exactly. So this is critical for folks who are using things like Discord bots or Slack, because they have a timeout window. You have to respond within, I think, like 10 seconds or 5 seconds. And this will allow you to keep your workflows warm too.
00:11:17 - Anthony Campolo
And why would you want to run it in a VPC?
00:11:20 - Dylan Pierce
So this is a fancy way of saying outbound HTTP requests. Say you're connecting to a private database that only allows specific IP addresses to connect, for security reasons. Or maybe you're running an in-house API that needs to be firewalled. Here, attach it. Yep. So all outbound HTTP requests from this workflow will use this specific IP address. 523-211-3. That's what it does. Yeah.
00:11:52 - Anthony Campolo
Sweet.
00:11:54 - Dylan Pierce
So I won't use these fancy features because we're not enterprise-y and we don't care about $100. Please. It's not that expensive. I'm shocked at how inexpensive this solution is. Yeah.
00:12:06 - Anthony Campolo
So actually, is there a free plan on Pipedream or do you start with a certain monthly limit?
00:12:12 - Dylan Pierce
No, it's free. The only limitations are the number of workflows that you can have active and the number of connected accounts, such as Slack and Google Sheets, out of the thousand-plus services that we integrate with. That's the limitation. And you won't get some of these more fancy features that are more enterprise-y, like the VPC one. Like, if you're worried about VPCs, then you probably have someone to pay for that. Yeah, like you have a budget essentially.
00:12:43 - Anthony Campolo
Exactly. Cool. Yeah, I like that. Because for the type of dev who wants to spin something up, getting one service to automate one task can be really useful and valuable if it's the right task.
00:12:56 - Dylan Pierce
Exactly. I think there's two types of work at any business. There's the product work that makes your business unique and actually adds value to the customer. And there's this side integration work that marketing or customer service needs. They're like, I need to pipe my Zendesk tickets to Slack and know when things are finished, or marketing needs customers added to a CRM. That's still important, but it shouldn't take up developers' time. And if it does, it should be really quick and easy to change and flexible, which is where Pipedream really shines.
00:13:39 - Anthony Campolo
Is there an integration you use that you feel gives you really high leverage, one that you enjoy?
00:13:46 - Dylan Pierce
I use Pipedream quite a bit. I love using it with Slack especially. I've created an app called Dictionary, which is like Pictionary but uses DALL-E. You get a prompt like "please describe a key-value data store" and you try to create the image in DALL-E. It's a game: it'll share the image within a Slack channel, and whoever guesses it is the next person to go. I built that all in Pipedream, and it really shines because you don't have to worry about the UI layer. Discord and Slack give you the tools to create a UI, so you just worry about the back end. That's where Pipedream is really, really nice. You don't have to worry about backend or frontend at all. Someone else is handling it for you. I'm just going to try this again. I'm not sure why
00:14:40 - Anthony Campolo
Mine timed out, we were talking so long.
00:14:43 - Dylan Pierce
It's probably a Lambda that died. Okay, all right. So we finally made it to the workflow start screen. This is where you define the trigger. We have many different types of triggers. The most popular one is creating a webhook or an HTTP endpoint. This will be a unique URL, automatically generated, that you can use to trigger this workflow. In other words, it's just an API endpoint. You can even generate a test event with a Postman-like tool to test your workflow from this initial setup.
00:15:21 - Scott Steinlage
That's pretty cool. Yeah, that's what I use, the webhook to get information from input on a form on ClickFunnels over to another process. But anyway.
00:15:34 - Dylan Pierce
Oh cool, cool. Yeah, that's the bread and butter.
00:15:38 - Anthony Campolo
But real quick: when I was first learning to code, people used to always talk about webhooks, and it confused the crap out of me because nobody ever actually sat me down and showed me just
00:15:48 - Dylan Pierce
what it was.
00:15:49 - Anthony Campolo
That it's a little hunk of code that runs on... People would talk about webhooks and all the things they would do with them, and it was always very abstract for a long time. Like, what is a webhook? Then I would see examples like this and think, okay, now I get it. But when I just heard people talking about it, it didn't make any sense to me.
00:16:05 - Dylan Pierce
Right. It's like, why not just call it a reverse API call? We all understand what that means. It's an API call coming to you rather than you going out and asking some service, you know. But somebody called it a webhook and that just kind of stuck for some reason.
00:16:20 - Scott Steinlage
Because you hook it and you pull it in.
00:16:24 - Dylan Pierce
Yeah, I guess.
00:16:25 - Scott Steinlage
I don't know.
00:16:27 - Dylan Pierce
It's kind of like a web push.
00:16:28 - Scott Steinlage
Yeah, push.
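To make the "reverse API call" idea concrete, here is a minimal sketch of what sits behind a webhook URL: a handler that waits for another service to POST JSON to it. This is plain Node.js for illustration, not how Pipedream implements its HTTP triggers. The names `handleWebhook` and `startWebhookServer`, the port, and the response payload are assumptions.

```javascript
// Handle one incoming webhook call. `req` and `res` follow the shape of
// Node's http.IncomingMessage and http.ServerResponse.
async function handleWebhook(req, res, onEvent) {
  if (req.method !== "POST") {
    res.writeHead(405, { "Content-Type": "text/plain" });
    res.end("Method Not Allowed");
    return;
  }

  // Collect the request body. The sending service decides when this
  // arrives, which is what makes it a push rather than a pull.
  const chunks = [];
  for await (const chunk of req) chunks.push(chunk);

  let event;
  try {
    event = JSON.parse(Buffer.concat(chunks).toString("utf8"));
  } catch {
    res.writeHead(400, { "Content-Type": "text/plain" });
    res.end("Invalid JSON");
    return;
  }

  // Hand the parsed event to whatever should run next (the "workflow").
  await onEvent(event);
  res.writeHead(200, { "Content-Type": "application/json" });
  res.end(JSON.stringify({ received: true }));
}

// Expose the handler on a hypothetical local port. The dynamic import
// lets the snippet run as either CommonJS or an ES module.
async function startWebhookServer(port = 3000, onEvent = console.log) {
  const http = await import("node:http");
  const server = http.createServer((req, res) => {
    handleWebhook(req, res, onEvent).catch(() => {
      res.statusCode = 500;
      res.end();
    });
  });
  server.listen(port);
  return server;
}
```

A hosted platform generates the URL and runs the server for you. The handler above is the part that corresponds to the first step of a workflow.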
00:16:29 - Dylan Pierce
Yeah, yeah, push. So this is one type of trigger. I mean, we have many, many, many. You can set a custom interval, so this is more like a cron. And you can even set a cron expression.
00:16:41 - Anthony Campolo
Do you mean just like a Node script, right? Yeah. Like JavaScript. Right, like an HTTP webhook, but just with a hunk of Node code.
00:16:51 - Dylan Pierce
Yes. So that's called a source. I don't know if...
00:16:54 - Anthony Campolo
This is the first thing I tried to do when I used this.
00:16:56 - Dylan Pierce
Oh, okay. We don't have a built-in code editor in here. That is something we're talking about, actually. You can publish custom sources or custom triggers, though. We have a bunch of documentation on how to do that. It's just called.
00:17:14 - Anthony Campolo
Yeah, we don't have to go into that then, because I know your other example is going to be pretty, pretty involved. So what are the services you're going to use to create this like, AI thing?
00:17:25 - Dylan Pierce
So we're going to take a screenshot of a site. We're just using my personal site. We just released support for Puppeteer and Playwright out of the box in Pipedream. I should probably back up. Pipedream supports Node.js code, Python code, and what's neat about.
00:17:46 - Scott Steinlage
Right, exactly.
00:17:48 - Dylan Pierce
Oh, I thought you meant the trigger. My bad. Yeah, in a step. Yeah.
00:17:51 - Anthony Campolo
Just like this. This is like a node editor right here. And you're using Axios to make the fetches.
00:17:57 - Dylan Pierce
Yeah, you can import any NPM package. There's no package.json to worry about. You just literally test the code and it will execute this code in a test Lambda and voila, it works. But the problem we had for the longest time was getting certain NPM libraries, namely Puppeteer and Playwright, to work, because Puppeteer and Playwright rely on Chrome. Chrome needs to be built for Lambda and it's also huge. But AWS recently increased the Lambda size from like 2 gigs to 8 gigs, which solved the size problem for us. The remaining problem was getting the right versions of Puppeteer and Playwright to work.
00:18:39 - Anthony Campolo
Can you actually explain what Puppeteer and Playwright are for people who've never used them?
00:18:43 - Dylan Pierce
Yeah. Good call. So Puppeteer and Playwright are browser automation tools. They are a thin wrapper, like a Node.js wrapper, around Chromium, which is the open-source version of Chrome, and they allow you to programmatically click and type. Any action you could take in a web browser, you can do using Puppeteer and Playwright. What's different about Playwright versus just using a regular HTTP GET action and getting the HTML is that it renders the whole site in a Chrome instance. So think JavaScript-heavy apps like SPAs that wouldn't work with a regular HTTP request because it's returning a JavaScript blob. You need to render it in order to see the actual hydrated content. So Chrome, Puppeteer, and Playwright render JavaScript, and you can even invoke JavaScript inside of them. They are really useful browser automation tools and can perform things like taking screenshots, getting PDFs, clicking buttons, navigating the page, and filling in forms. They're really useful for browser testing too. It's a long-winded explanation, but hopefully that helps.
00:20:06 - Anthony Campolo
No, no, I think that's great because this is another tool that some people use it all the time. Some people, they don't really have a reason to delve into these kind of things. But I think that something like getting a PDF, if you look at something like Claude, you can give it a PDF that's like very, very large and it can summarize things. So you can create pipelines now where you just do this whole thing where you create stuff and then you suck it in, then you put in this other thing. These tools are becoming more and more useful, more and more high leverage as we have more and more ways to integrate these different data sources by just piping things to models.
00:20:43 - Dylan Pierce
Exactly. You can use get PDF to generate the PDF. There are libraries we could use to generate PDFs from code, but they're notoriously difficult to work with. So if you know HTML and CSS, instead of learning yet another library that generates PDFs, you can just put up a webpage and use this get PDF action to generate a PDF instead. So you just leverage technologies you already know. Yeah, so that's kind of the gist of Puppeteer and Playwright and what they can be used for. And how this works under the hood is we published a special package that has the built-in Puppeteer version with Chrome bundled. So you import this special NPM library and then create a brand-new browser. And from a browser you create a new page. It's familiar, just like opening Chrome on your local computer. Open a new page, go to the URL, and then you can perform actions on the page, get the content, get the title, and it will do that for you. So that's kind of the basics of how it works.
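The browser, page, URL sequence Dylan walks through looks roughly like this with the standard Puppeteer API. This is a sketch under assumptions: the special Pipedream package is not named in the conversation, so the function accepts any object that exposes Puppeteer's `launch()` method (for example, the result of `require("puppeteer")`). The function name `readPage` and the `networkidle2` wait condition are choices made for the example.

```javascript
// `puppeteerLike` is any object with Puppeteer's launch() method.
// Injecting it keeps the sketch independent of which package supplies
// the bundled Chrome.
async function readPage(puppeteerLike, url) {
  // Create a brand-new browser, then a new page inside it.
  const browser = await puppeteerLike.launch();
  try {
    const page = await browser.newPage();

    // Navigate and wait for network activity to settle, so that
    // JavaScript-rendered content has a chance to hydrate.
    await page.goto(url, { waitUntil: "networkidle2" });

    // Read the rendered title and the full rendered HTML.
    const title = await page.title();
    const html = await page.content();
    return { title, html };
  } finally {
    // Always close the browser so the container does not leak Chrome.
    await browser.close();
  }
}
```

With Puppeteer installed locally, a call might look like `readPage(require("puppeteer"), "https://example.com")`.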
00:21:57 - Dylan Pierce
But what's really, really neat that I want to show you guys is yeah, you can see it create, grab the title, grab the content.
00:22:05 - Anthony Campolo
Your camera just clicked off.
00:22:08 - Dylan Pierce
You know what, with this camera, I can't adjust the sleep settings, and it'll always die at 25 minutes. I'm in the market for a new camera.
00:22:19 - Anthony Campolo
Oh my God.
00:22:19 - Scott Steinlage
Interesting.
00:22:20 - Dylan Pierce
Yeah. Do not get the Mark 1 Canon M50. That's.
00:22:25 - Scott Steinlage
Oh, got it.
00:22:26 - Dylan Pierce
Yeah. Mark 2, they fixed it.
00:22:28 - Scott Steinlage
Yep.
00:22:30 - Dylan Pierce
But to continue on with Pipedream, you can build pre-built actions that leverage Node.js code and kind of make a nice little wrapper on top of it. Or you can use Node.js like I just showed you. So for the most simple flow, we can use my site and just take a URL. There's a bunch of different options where you can adjust the size, change the path, etc. Then it will spin up a browser, navigate to that page, and perform a screenshot. So we're going to wait a second for the screenshot to finish. I mean, it's running Chrome. Here we have it: it's a base64-encoded screenshot, and then it's as simple as uploading it somewhere else to view it. Imgur is kind of my favorite because it's just so easy to use. Just upload an image action. I chose my Imgur account and pasted the path from that past step into the image action, and it will upload to Imgur. So we just connected a scraper plus Imgur for uploading, and we really didn't write any code. So go over to Imgur and check out my pictures.
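For readers who prefer code to the pre-built action, the screenshot step can be approximated with Puppeteer's `page.screenshot` and its `base64` encoding option. This is an illustrative sketch, not the source of Pipedream's action: the function name `screenshotAsBase64`, the default viewport, and the `fullPage` setting are assumptions, and the Puppeteer-compatible object is passed in because the bundled package is not named in the conversation.

```javascript
// Capture a page as a base64-encoded PNG string.
// `puppeteerLike` is any object exposing Puppeteer's launch() method.
async function screenshotAsBase64(
  puppeteerLike,
  url,
  viewport = { width: 1280, height: 800 } // assumed default size
) {
  const browser = await puppeteerLike.launch();
  try {
    const page = await browser.newPage();

    // Control the rendered size, similar to the size options in the UI.
    await page.setViewport(viewport);
    await page.goto(url, { waitUntil: "networkidle2" });

    // With encoding "base64", screenshot() resolves to a string that a
    // later step can forward to an upload API.
    return await page.screenshot({ encoding: "base64", fullPage: true });
  } finally {
    await browser.close();
  }
}
```

The returned string is what a later step would hand to an image host such as Imgur, or to S3.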
00:23:56 - Dylan Pierce
Hopefully we see it. Yeah, there's my homepage. So you can imagine that, instead of using Imgur, you could use GitHub and create a comment on a PR. You can use S3, upload it to S3, and pipe it anywhere. You can even pipe it to an image manipulation API, like Cloudinary, and manipulate the image. It's pretty mind-blowing what you can do by piping together these steps. So that's kind of the basic use of this library. For the AI stuff, we did screenshotting, but you can also use AI with Puppeteer. And this is kind of a sample of the AI code gen that's available in any app that we support. So instead of having to look up Puppeteer's documentation and figure out how to grab the H1 from a website, let's just get the H1. We can ask the GPT-4-trained AI this question.
00:25:08 - Anthony Campolo
So this is accessing OpenAI's API directly.
00:25:12 - Dylan Pierce
Yes. And we have a little bit of prompt magic to guide it to know that it's talking about Puppeteer. It knows that the context is Puppeteer, so it will include that in the answer.
00:25:25 - Scott Steinlage
Created an agent?
00:25:28 - Anthony Campolo
Yes.
00:25:29 - Dylan Pierce
Yes.
00:25:30 - Anthony Campolo
Yeah, it's not really embedding. It's more like prompt engineering. Even ChatGPT comes with its own prompts hidden underneath. They don't like you knowing how to see them, but that's just kind of normal for how we set up these bots. An agent would be if you had it do a series of tasks, right? When people use the term agent, they're talking about a thing that can actually make decisions. You tell it to do a thing and it does multiple steps, going from one to the other. That's what an agent is in the current usage of the term.
00:26:09 - Dylan Pierce
Yeah, you can kind of think of it as this particular prompt being hyper-focused on Puppeteer and its documentation. It's more like fine-tuning specifically to this app. So when I ask it, like, please grab the H1 from the webpage, I don't need to tell it that we're talking about Pipedream code here. We're talking about Puppeteer, please use Puppeteer methods. It's aware that it's in Pipedream and it's using Puppeteer. So it'll generate code that uses that NPM package, launch the browser, go to my webpage, and grab the H1. So it's pretty wild.
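The generated code is not reproduced in the transcript, but grabbing an H1 with Puppeteer usually comes down to something like the following hand-written sketch. `getH1` is a hypothetical helper name, and the page object is assumed to come from a browser launched elsewhere in the step.

```javascript
// Sketch only: `page` is a Puppeteer Page obtained from browser.newPage().
async function getH1(page, url) {
  await page.goto(url);
  // page.$ resolves to null when nothing matches, so check before reading.
  const handle = await page.$("h1");
  if (!handle) return null;
  // $eval runs the callback inside the browser against the first match.
  return page.$eval("h1", (el) => el.textContent.trim());
}
```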
00:27:02 - Scott Steinlage
But in your prompt, I'm sure, on the back end here you're telling it to use that certain package, because it wouldn't know that. Its knowledge only goes back to when, 2021, right?
00:27:14 - Anthony Campolo
So when did Pipedream create the Puppeteer package?
00:27:18 - Scott Steinlage
Just was recent though.
00:27:20 - Dylan Pierce
Correct. We had to embed all of the Pipedream
00:27:23 - Scott Steinlage
documentation inside of that.
00:27:24 - Dylan Pierce
Yeah, inside of this. Yeah, exactly.
00:27:27 - Scott Steinlage
Did you guys use vector DB for that or does it just. Is it just solely prompts?
00:27:32 - Dylan Pierce
No, there is vector storage for all of it. We do crawl all of the Pipedream documentation.
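Dylan does not describe how the vector storage is queried, so the following is only a generic sketch of retrieval over embedded documentation chunks, not Pipedream's implementation. Toy three-dimensional vectors stand in for real embeddings, and every name here is hypothetical.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored doc chunks against the query vector and keep the best k.
function topKChunks(queryVector, chunks, k = 2) {
  return chunks
    .map((chunk) => ({ ...chunk, score: cosineSimilarity(queryVector, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Prepend the retrieved text to the user's question before calling the model.
function buildPrompt(question, retrieved) {
  const context = retrieved.map((c) => c.text).join("\n---\n");
  return `Use this documentation:\n${context}\n\nQuestion: ${question}`;
}
```

In a real system the query vector would come from the same embedding model used to embed the crawled docs.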
00:27:39 - Anthony Campolo
That's the only way to ensure it's actually going to know. What the heck.
00:27:41 - Scott Steinlage
That's what I was wondering.
00:27:42 - Dylan Pierce
Yeah.
00:27:42 - Anthony Campolo
Yeah, that's really fascinating. How much data does that take up?
00:27:47 - Dylan Pierce
You know, that's a really good question. I don't know. I would have to take a look. I wonder if I'm just typing this wrong. So for some reason during the demo, it did not pull in the correct NPM package. It tried to grab a custom Axios package of ours, which is incorrect, but I believe the Puppeteer code itself is correct. And for those who want to interact with the Pipedream AI without making an account, you can go to our Slack and talk to our AI agent there. It's also trained. So, cool: it grabbed the H1 and returned it. I used this feature extensively to create a Reddit scraper, because the Reddit API has been locked off. You have to pay a ridiculous amount of money to use it. So now I have a scraper where I didn't have to go in and remember all the query selectors. I gave it a snippet of HTML in this prompt and said, please grab the Reddit comments from this HTML, and gave it an example, and it was smart enough to pull out the individual upvotes, name, comment, etc.
00:29:00 - Anthony Campolo
Yeah, this is super cool, actually. I remember when I was first learning to code, I was looking at videos to see what you could do, and web scraping as a skill came up a lot. A lot of people were using Python scripts and stuff like that, so it was very unapproachable for me at the time. But this is an extremely approachable way to scrape a website and actually pull a specific piece of data off it. That's the hard part. If you're just making a call and getting back a huge chunk of HTML, it's hard to even know what to do with that manually. Whereas this is just, I want this part of the website, so you can target the exact data you need really easily with very little code.
00:29:43 - Dylan Pierce
Exactly. Yeah. There's a whole generation of web scraping apps that are built on AI for that purpose. For someone to easily create a scraper from text prompt. And here you get it for free, which is kind of cool. But again, you have to be a developer to know how to use this stuff.
00:30:02 - Scott Steinlage
I would say, however, that learning to scrape is beneficial, because you really have to dig around on the web page inside of the code, in developer mode or whatever, to determine what you're trying to pull from, to scrape. So it really helps you learn more about that.
00:30:34 - Dylan Pierce
Yeah.
00:30:34 - Scott Steinlage
At least personally. So. Yeah.
00:30:36 - Dylan Pierce
Yeah.
00:30:36 - Scott Steinlage
But this is super cool. It makes it easy though, if you don't want to go through that process, like jump in here and. And go have a go.
00:30:42 - Dylan Pierce
It's awesome, right? It works great, especially if you know the element is going to exist. Any good webpage is going to have an H1. Another good example for SEO or marketing devs is pulling all the meta information.
00:31:01 - Anthony Campolo
So yeah, like description and OG tags.
00:31:05 - Dylan Pierce
Right, like pull all meta tags from Pipedream. Hopefully it's smart enough to realize I'm talking about an array. Yeah, so it's grabbing all the tags, getting attributes, returning as an object, and using the right NPM package. Cool, use the code. So for known structure ahead of time, of any generic webpage, fantastic. There is a little bit of elbow work to get it to play nicely with custom elements and such.
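The generated meta tag step is not shown in the transcript. A hand-written equivalent of what Dylan narrates (grab all the tags, read their attributes, return an object) could look like this sketch, where `getMetaTags` is a hypothetical name and keying by `name` or `property` is an assumption about what is useful.

```javascript
// Sketch only: `page` is a Puppeteer Page obtained from browser.newPage().
async function getMetaTags(page, url) {
  await page.goto(url);
  // $$eval runs the callback in the browser with every matching element.
  const pairs = await page.$$eval("meta", (els) =>
    els.map((el) => [
      // Standard tags use name="description"; Open Graph uses property="og:title".
      el.getAttribute("name") || el.getAttribute("property"),
      el.getAttribute("content"),
    ])
  );
  // Build a plain object in Node, skipping tags without a key or content
  // (for example <meta charset="utf-8">).
  const meta = {};
  for (const [key, value] of pairs) {
    if (key && value !== null) meta[key] = value;
  }
  return meta;
}
```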
00:31:45 - Scott Steinlage
Yeah, I guess you would still need to dig into that.
00:31:48 - Dylan Pierce
Yeah, yeah, exactly. But you know, I. I find it a lot easier to do that and just copy HTML and paste it into the AI prompts than it is to look up the documentation. Yeah. What's the puppeteer? How do you spell puppeteer?
00:32:07 - Anthony Campolo
Yeah, I'm constantly just grabbing huge hunks of code and throwing them into ChatGPT and asking questions like, how do I do this?
00:32:14 - Dylan Pierce
Yeah, exactly. I'm still old school. I still use Google more than I use AI. But you're going through Stack Overflow, like, which of these answers are updated?
00:32:23 - Scott Steinlage
Like, yeah.
00:32:24 - Dylan Pierce
Wasting so much time. But this just worked. I mean, with a static, known structure of a page.
00:32:34 - Anthony Campolo
Great.
00:32:36 - Dylan Pierce
Awesome. When it comes to more involved, like, we can try Reddit. Let's see if we can get Reddit to work.
00:32:45 - Anthony Campolo
Twitter, actually.
00:32:46 - Dylan Pierce
Twitter, I thought Twitter, you have to be.
00:32:51 - Anthony Campolo
They require an account now, I think. I'm not sure if that's still the case. Let me see if I can see myself in incognito.
00:33:00 - Dylan Pierce
Yeah. See if you can share a link and we can try to scrape it. That'd be fun to do.
00:33:05 - Anthony Campolo
Yeah, you can just do my Twitter. It's just ajcwebdev.
00:33:09 - Dylan Pierce
I'll put it in the chat. AJC... I have this weird audio dyslexia. I have to talk it out loud. Oh, it is public. Look at that. All right, cool. So we're going to take a look at the elements here. We're going to try to.
00:33:27 - Scott Steinlage
Do you want to share that screen or. It's up to you.
00:33:30 - Dylan Pierce
Oh, yeah, I got you.
00:33:32 - Anthony Campolo
Yes. We're only seeing this one screen.
00:33:34 - Dylan Pierce
I did it in incognito, so I'll just go through here because I know it's the same. So the way I look at it here, I can see they're doing some like.
00:33:43 - Anthony Campolo
We're still not seeing your screen right now.
00:33:45 - Scott Steinlage
Yeah, no, I think you're only sharing this one page.
00:33:51 - Dylan Pierce
Let's see if I can. Instead of your browser, I guess. Share screen. Share screen. Oh, window.
00:33:59 - Anthony Campolo
When you're presenting, one option only shows a single tab and another shows your whole desktop.
00:34:03 - Scott Steinlage
Right, gotcha.
00:34:05 - Dylan Pierce
Is that better?
00:34:07 - Scott Steinlage
Yeah, there we go.
00:34:10 - Dylan Pierce
So we're going to try.
00:34:12 - Anthony Campolo
Go ahead and increase your font a couple times again though.
00:34:17 - Scott Steinlage
Cool.
00:34:17 - Anthony Campolo
Yeah.
00:34:18 - Dylan Pierce
All right, so we have the Twitter HTML open in Chrome DevTools. We're looking for some kind of pattern that we can guide the AI to.
00:34:31 - Scott Steinlage
Right.
00:34:31 - Anthony Campolo
Just try and grab a work.
00:34:33 - Dylan Pierce
Yeah, so there's something nice here: data-testid equals tweet. That's cool. I would probably use that because there's so much auto-generated CSS. These class names are definitely unique and generated at build time. So we know we can use this data-testid, and we know that within it there's content, eventually a span. The span has no other interesting.
00:35:04 - Anthony Campolo
It's a span inside a div inside a div. This is what people mean when they talk about div soup and how it makes it almost impossible to find any structure within the HTML.
00:35:14 - Dylan Pierce
But here's something cool: data-testid, tweetText. We could probably just grab this instead.
00:35:22 - Scott Steinlage
Right?
00:35:22 - Dylan Pierce
Tell it. I'll go back here. This is totally untested. I've never tried to scrape Twitter before.
00:35:32 - Scott Steinlage
That's what makes this so cool.
00:35:34 - Dylan Pierce
Please visit your Twitter and extract the inner HTML, or, not inner HTML. We care about the span contents within the div with the attributes. And check my Raycast here. See if it's smart enough to realize that we have an array of these test attributes, right?
00:36:17 - Scott Steinlage
If it'll continue.
00:36:19 - Dylan Pierce
Yeah. So we're hoping. Yeah. It is finding the div. It's finding the span within, and then it's going over each of them and trying to retrieve the text content of the spans.
00:36:29 - Scott Steinlage
It looped it.
00:36:31 - Dylan Pierce
Yeah. This will be crazy if it just works out of the box.
00:36:40 - Anthony Campolo
Like live streaming do on. On tweets and just. Yeah, I think locked down their API recently.
00:36:49 - Scott Steinlage
I was going to say more expensive. Oh yeah. But Twitter's going to like be like, oh, someone's scraping us now.
00:36:56 - Dylan Pierce
Well, the nice thing about Pipedream, like I mentioned before, is it spins up a Lambda anywhere in us-east-1 by default. So, like, the traditional tools are kind of hard to... They can't detect Pipedream's IP address.
00:37:11 - Scott Steinlage
Right.
00:37:11 - Dylan Pierce
It didn't quite work this time. It's too bad.
00:37:14 - Scott Steinlage
Yeah, I bet if we tweaked it, we could figure something out.
00:37:16 - Dylan Pierce
But yeah, test ID. Let's just try to see if it can even find that tweet text. I wonder if it's screwing up on the inner text. I'm not sure if that's valid JavaScript.
00:37:35 - Scott Steinlage
Sometimes it'll hallucinate that stuff.
00:37:37 - Dylan Pierce
Yeah, yeah.
00:37:37 - Anthony Campolo
Well, yeah. Is it running in Node? Because if it's running in Node and it tries to call a browser API, that's a DOM thing it's trying to run.
00:37:47 - Scott Steinlage
What's the temperature you guys have on this? Can you recall off the top of your head?
00:37:51 - Dylan Pierce
Oh, sorry, I do not know that. But we can return this and we'll just log out what the element is. Can it even find an element? Let's see. I have a feeling innerHTML is going to just do better. I think innerHTML is a valid... oh, shoot... valid attribute. But we'll see. This whole thing kind of highlights how easy it is to make tweaks and then retest something. I'm not going between my terminal, my browser, and VS Code. I'm just clicking test and rerunning things, and I can see the logs. I have a feeling that there's a bug in the code gen that's running the old code on the first run, and you're seeing it live. But I'll test again and find out. That's kind of the nice thing about using this system. It's not finding any at all.
00:38:58 - Anthony Campolo
It's just incredible how much you can do without needing to even be in a code editor.
00:39:03 - Dylan Pierce
Yeah.
00:39:03 - Scott Steinlage
Yeah.
00:39:06 - Dylan Pierce
Did I pick the wrong thing? I wonder if we have to wait for this thing to exist. If this is an SPA, then we have to wait.
00:39:14 - Scott Steinlage
Oh, yeah. Till it loads.
00:39:17 - Dylan Pierce
Yeah. And wait for an instance of it to appear.
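Waiting for an element in a single-page app is typically done with Puppeteer's `waitForSelector`. The sketch below combines that with the tweet text attribute read aloud earlier in the demo. The selector spelling, the 10 second default timeout, and the function name are assumptions, and waiting cannot help if the site returns no content to the scraper at all.

```javascript
// Sketch only: `page` is a Puppeteer Page obtained from browser.newPage().
async function getTweetTexts(page, url, timeoutMs = 10000) {
  const selector = '[data-testid="tweetText"]'; // attribute as read out in the demo
  await page.goto(url);
  try {
    // In an SPA the markup is rendered after load, so wait for the first match.
    await page.waitForSelector(selector, { timeout: timeoutMs });
  } catch (err) {
    // Nothing appeared in time: blocked, login wall, or the markup changed.
    return [];
  }
  // Collect the visible text of every matching element.
  return page.$$eval(selector, (els) => els.map((el) => el.textContent.trim()));
}
```

Returning an empty array on timeout keeps the workflow step from throwing, which makes it easier to log the full page content in a later step and see what actually came back.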
00:39:28 - Scott Steinlage
Let's see what it puts in the code. I'm curious. This is cool.
00:39:32 - Dylan Pierce
I don't remember how to wait for stuff. So.
00:39:36 - Scott Steinlage
Yeah.
00:39:36 - Anthony Campolo
You're like, don't do it.
00:39:37 - Scott Steinlage
And it's gonna.
00:39:38 - Dylan Pierce
Hopefully do it.
00:39:39 - Scott Steinlage
Yeah. Await, Await page. What's it say? I can't breathe in your bed.
00:39:48 - Dylan Pierce
That can't be it.
00:39:50 - Anthony Campolo
So it might actually. So I'm trying to curl it and you can't do that. So it might be that you have to actually open it in a real browser and they're able to, like, tell that you're not doing that. Maybe.
00:40:04 - Scott Steinlage
Yeah.
00:40:04 - Dylan Pierce
Maybe we hit some kind of security thing. So let's just return the content. Like the whole page content.
00:40:12 - Anthony Campolo
If you're getting back anything.
00:40:14 - Dylan Pierce
Yeah. And then we'll just export that real
00:40:16 - Scott Steinlage
quick and then do it from there.
00:40:20 - Dylan Pierce
Yeah. And see what it looks like. I have a feeling that Twitter is smarter. Elon Musk and his dev team are probably on the lookout for this kind of stuff.
00:40:35 - Scott Steinlage
I'm sure
00:40:37 - Dylan Pierce
If you're being charged $42,000 a month for the entry level for better access.
00:40:42 - Scott Steinlage
Right. You better be blocking all the different ways.
00:40:44 - Dylan Pierce
Yeah. So it's zero. They might even be blocking all of the AWS us-east-1 IP addresses. Like, you're not getting any content whatsoever. So I'm sorry, viewers. You just watched us hit some. It was fun.
00:41:00 - Scott Steinlage
Yeah, it was cool.
00:41:02 - Anthony Campolo
It was interesting because it shows how every platform is different and that's why having all these integrations gives you so much leverage.
00:41:09 - Scott Steinlage
Yeah.
00:41:09 - Dylan Pierce
You can use this with Reddit.
00:41:11 - Scott Steinlage
Yeah, let's do it.
00:41:12 - Dylan Pierce
Yeah. r/javascript, let's say. So we care about.
00:41:19 - Scott Steinlage
The problem is they're not charging enough, I guess.
00:41:23 - Dylan Pierce
Well, also, I noticed how easy it is. Like, have you ever looked at the Reddit source, HTML source before? Uh, this is pretty wild. It's like it's built to be scraped.
00:41:34 - Scott Steinlage
Probably because they did it themselves for some certain. Some reason.
00:41:38 - Dylan Pierce
Yeah, they probably like, for example, some
00:41:41 - Scott Steinlage
sort of analytics or something.
00:41:45 - Dylan Pierce
It gives you the ID of the actual comment in the. It tells you how many posts, data-replies.
00:41:52 - Scott Steinlage
See, they're using like Mixpanel and they're like, let's just make this really easy.
00:41:55 - Dylan Pierce
Yeah, it gives you the data-subreddit, that kind of stuff. So we'll tell it to go here. Let's make a new one. We're going to scrape Reddit instead because they're a lot more friendly to folks like us. So, Puppeteer, use AI. Please visit this page and grab all... what do we care about? Maybe the comments, like the actual comments themselves.
00:42:28 - Anthony Campolo
Yeah, you can like grab those and run sentiment analysis and be like, do people like my thing?
00:42:33 - Dylan Pierce
Exactly, yeah. Do people like my thing? So within the comment there's a data-type equals comment. Cool, that's pretty descriptive. And within those comments there's a paragraph. So there's a p tag within the parent. I think that's a good way to do it. Or a div with class md with a p tag within that. Or we could just say that the markdown. Oh, you think so?
00:43:22 - Scott Steinlage
Maybe that's why I would label it that, bro.
00:43:28 - Dylan Pierce
So we'll tell it to... I misspelled divs as devs. And within these comments, extract the value of the p tag within the div. I mean, that's pretty terrible English, but I think the AI is smart enough to figure that out. Go to the page, extract all the comments, and within the comments, select the. Yeah, that might work.
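Translated by hand, the prompt Dylan is composing might turn into something like this sketch. It uses the two hooks he reads from the page source: the comment data-type attribute and the md div holding paragraphs. The `:scope > .entry` part is an assumption about old Reddit's markup, added so a parent comment does not also collect the text of its replies. `getRedditComments` is a hypothetical name.

```javascript
// Sketch only: `page` is a Puppeteer Page pointed at an old.reddit.com thread.
async function getRedditComments(page, url) {
  await page.goto(url);
  const texts = await page.$$eval('[data-type="comment"]', (comments) =>
    comments.map((comment) =>
      // Assumed layout: each comment's own body sits under a direct
      // child ".entry"; nested replies live in a sibling container.
      Array.from(comment.querySelectorAll(":scope > .entry .md p"))
        .map((p) => p.textContent.trim())
        .join("\n")
    )
  );
  // Drop comments with no paragraph text (for example, deleted ones).
  return texts.filter((text) => text.length > 0);
}
```

The resulting array of strings is the kind of input that could be passed to a later step, such as the sentiment analysis idea Anthony mentions.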
00:44:07 - Anthony Campolo
And you should make sure to point out that you need to use old.reddit. This is because there are multiple, like, Reddits.
00:44:15 - Dylan Pierce
Oh yeah. So for those who have not been using Reddit for 20 years, old Reddit was the original Reddit. reddit.com is the new version, which tries
00:44:25 - Anthony Campolo
to redirect you to the app and has like age restrictions. There's a lot of big differences.
00:44:30 - Dylan Pierce
Yeah. And I think, from my deep dive before, old.reddit.com just has an easier HTML structure to scrape. The new one, you can. It's just not. It's using the same kind of dynamic class names we saw on Twitter, if you guys choose to include that clip. But this will hopefully extract the text. There are also cool things you can do inside the HTML, like grab upvotes. There's a straight-up upvotes span that gives you both downvotes and upvotes. So dislikes is zero, upvotes is one, or score is one. Yeah, that's pretty cool.
00:45:12 - Anthony Campolo
And then you just train more models with all your Reddit.
00:45:18 - Dylan Pierce
Yeah, exactly, exactly.
00:45:21 - Anthony Campolo
I've heard that Reddit data has been used to train things like GPT because it's such a massive source of unique human text on the widest possible breadth of subjects you could ever imagine.
00:45:33 - Dylan Pierce
Yeah, so it actually worked, guys.
00:45:35 - Scott Steinlage
That's awesome.
00:45:38 - Dylan Pierce
Yeah, so we didn't look up the Puppeteer documentation once. We just sat here and played with prompts, looked at HTML, and grabbed comments. We could play with it more and grab the author name, grab upvotes, grab reply counts. There's all kinds of data to play with. But yeah, you can scrape Reddit in this way.
00:45:59 - Scott Steinlage
Super cool.
00:46:01 - Anthony Campolo
Yeah. And I imagine Hacker News would be very similar because they're basically the same.
00:46:06 - Scott Steinlage
I wonder if you, you know, so you used a lot of language that would be known by a developer, but if you use the language that was more common just to the English language rather than developer language, would it still work?
00:46:21 - Dylan Pierce
Yeah, let's find out.
00:46:22 - Scott Steinlage
So I think ChatGPT would be smart enough to figure it out.
00:46:25 - Anthony Campolo
But that's the thing that really blows my mind about ChatGPT: when I explain what I want to do, I literally explain exactly what I want to do, and it goes from that to figure out what it wants. I find that sometimes I don't even need to specify anything code related. It's purely English. That's what I'm saying, describing it.
00:46:45 - Scott Steinlage
Don't say like, you know, we had
00:46:48 - Anthony Campolo
told it like grab the tweets in this. Like that was. You can actually get through the website. But no, something like that. Instead of trying to tell it.
00:46:55 - Scott Steinlage
So say it here.
00:46:56 - Anthony Campolo
I need to grab this HTML thing, right?
00:46:58 - Scott Steinlage
Yeah, grab the comments on this page here, you know, or something like that. And, and I don't even know if you would. Let's try without the class.
00:47:12 - Anthony Campolo
Yeah, just straight to the link. Yeah, yeah, basically. Can normal people do this?
00:47:16 - Scott Steinlage
That's what I'm saying.
00:47:19 - Dylan Pierce
I think because it does not have the context of the md, where it's at. Md? Yeah. It's describing the body. It's assuming that there is a. It's not even using the right. It's not even using the right package. It's using Axios.
00:47:34 - Scott Steinlage
Yeah, weird. So to make this even better for people who just wouldn't be dev people you could add into your prompts or have. Because you have like, what do you guys call those? Like the pre written processes or tasks that you can tell it to do or whatever. By. By. Without having to write any code. You just click it and it. You know, I can't remember what you guys call them.
00:48:06 - Dylan Pierce
Actions.
00:48:06 - Scott Steinlage
Actions, yeah. So like you have these pre written actions or whatever. Right. So like maybe have one that's for. That would be a lot though because there'd be so many. Every side's different. Right. So yeah, never mind.
00:48:18 - Anthony Campolo
I would imagine if you wanted something more like the custom training you had for Puppeteer, you'd want to train it on what it would get back from a certain website and how to understand it. You could probably do that. That'd be a lot of interesting stuff to
00:48:32 - Scott Steinlage
do, but it would be. That's what I was saying. Yeah.
00:48:35 - Dylan Pierce
So I mean the way I think I envision this working is making this AI aware of data between steps. So let's like pretend we're getting all the content in one step and then the next step we would say, you know, we could just use regular node and say.
00:48:55 - Scott Steinlage
But you're. The other thing is like who's gonna. Who's not a developer that would know, oh let's go into node and start generating. You know what I'm saying? Like this is for developers. Right? Right.
00:49:04 - Dylan Pierce
Like for developers. That's how we're different from the competition.
00:49:07 - Scott Steinlage
Exactly.
00:49:07 - Dylan Pierce
Like. Like Zapier or others. We. We are.
00:49:11 - Scott Steinlage
That's your avatar.
00:49:12 - Dylan Pierce
Yeah. Like we are developer first. So we're going to 10x your productivity.
00:49:17 - Scott Steinlage
Right.
00:49:17 - Dylan Pierce
We're going to give you pre built actions too if you want to use them.
00:49:22 - Anthony Campolo
But what might actually be the missing bridge here is that now, with GPT-4, you can just show it a picture. If you were to show it the website.
00:49:32 - Scott Steinlage
That's what I was doing.
00:49:34 - Anthony Campolo
That might be all you need.
00:49:36 - Scott Steinlage
I did that with some things which is interesting.
00:49:38 - Dylan Pierce
It was able to infer the CSS classes and stuff, or find the markup? Oh, you mean like an actual image. Just take a screenshot. Yeah.
00:49:48 - Anthony Campolo
If it could look at the screenshot of the image and then you also let it call the endpoint to get the code, it could then correlate those two.
00:49:54 - Scott Steinlage
Yeah. And the other thing is if you like make it a. If you take the p. If you take a PDF of it. So like you have like a. Let's say it's a super long page because Reddit can go really long. Right. And so you take the whole page.
00:50:05 - Anthony Campolo
Right.
00:50:05 - Scott Steinlage
Or at least as much as you can get. And so that could be like 10 pages.
00:50:09 - Dylan Pierce
Right.
00:50:10 - Scott Steinlage
Or something. And then you take that PDF and then you export it to a PNG and then upload it to ChatGPT. It'll. It'll actually scroll through all that and then, you know, you can ask it to do anything from it.
00:50:24 - Dylan Pierce
That kind of obsoletes web scraping. Yeah.
00:50:27 - Scott Steinlage
From an image standpoint. But there's.
00:50:29 - Anthony Campolo
I.
00:50:29 - Scott Steinlage
There's definitely fallbacks, I'm sure, but I just not quite sure what that would be yet. I haven't used it as much as
00:50:35 - Dylan Pierce
you'd probably have to take the manual effort to like give it at least one highlight and say this is a comment. Maybe not. It might be smart enough.
00:50:42 - Scott Steinlage
I think it. Yeah, I think you could read it.
00:50:45 - Dylan Pierce
That's crazy. Never thought about the implications of the image processing like that. I just assumed website generation. Like, here's my crappy hand. Hand drawing, please.
00:50:57 - Scott Steinlage
No, I used it in the way we just spoke of actually very similarly, at least for some. For a project I did.
00:51:02 - Dylan Pierce
That's cool, huh? I need to get access to that. I don't have access to it yet.
00:51:07 - Anthony Campolo
Yeah, I've only had like two weeks and the first week it was like pretty much broken. Yeah, it's. It's really, it. It's a whole different way of thinking. You don't have access to now you just show it. What was that?
00:51:19 - Scott Steinlage
I was saying to Dylan, are you sure you don't have access to it? Because if you just log into ChatGPT and you have Pro.
00:51:24 - Dylan Pierce
Right.
00:51:24 - Scott Steinlage
You pay 20 bucks a month or whatever. Like there's a little image on the bottom account.
00:51:28 - Anthony Campolo
That's the question.
00:51:29 - Scott Steinlage
There's a little image on the bottom personal account.
00:51:31 - Dylan Pierce
No. No.
00:51:32 - Scott Steinlage
Okay.
00:51:32 - Dylan Pierce
I end up using. The nice thing about Pipedream is I end up using, you know, like our, our paid account because it's so easy to just select the. The account, you know, that should be.
00:51:42 - Anthony Campolo
Be able to do images then.
00:51:45 - Dylan Pierce
Yeah. Like I just use the company account this way. Yeah.
00:51:47 - Scott Steinlage
And it should be able to use Pipedream to select an image and then put one in there. Yeah, I don't know exactly.
00:51:55 - Dylan Pierce
So we could automate that flow you just talked about. Use Puppeteer to make a screenshot.
00:52:00 - Scott Steinlage
Yes, Exactly.
00:52:01 - Dylan Pierce
Use an OpenAI action or upload the image. Upload the image and then prompt it.
00:52:07 - Scott Steinlage
Ask it to do whatever.
00:52:08 - Dylan Pierce
Yep. And then expose it as an API and charge 20 bucks a month.
00:52:13 - Scott Steinlage
Oh, my gosh.
00:52:13 - Dylan Pierce
Right?
00:52:15 - Scott Steinlage
And then everybody does it. And then, Then, Then people.
00:52:19 - Dylan Pierce
Yeah.
00:52:19 - Scott Steinlage
Fail. But yeah,
00:52:23 - Dylan Pierce
that's uni. Yeah.
00:52:25 - Scott Steinlage
No, open source it. Come on.
00:52:27 - Dylan Pierce
Of course, of course. Because now you can. Using, like I said with. With projects, you can connect this project to GitHub.
00:52:33 - Scott Steinlage
GitHub.
00:52:34 - Dylan Pierce
And open source it. You can just.
00:52:35 - Scott Steinlage
Boom.
00:52:36 - Dylan Pierce
Yep. And this will serialize the code into GitHub.
00:52:40 - Scott Steinlage
It is Hacktoberfest month, you know, so
00:52:43 - Dylan Pierce
I'm still rocking my old Hacktoberfest T-shirt from before the whole debacle. Man, that was crazy. I forget, do they still give out T-shirts or is that over now?
00:52:54 - Scott Steinlage
I don't know.
00:52:55 - Anthony Campolo
They stopped and they said it helped a lot in cutting down on, like, PR spam. People aren't just, like, trying to get people to merge PRs that aren't actually anything so you can get a T shirt. So it. I think it was probably the right move.
00:53:07 - Dylan Pierce
People go to great lengths to get free swag. That's the lesson.
00:53:11 - Anthony Campolo
Yeah. So obnoxious.
00:53:13 - Dylan Pierce
Yeah. It was such a comfy T shirt. Yeah, it was a great T shirt. It was very comfy. Still. I still use it.
00:53:20 - Scott Steinlage
I love it when people do swag. Right. Because then I will use that shirt for a long time.
00:53:24 - Dylan Pierce
Right, Right.
00:53:25 - Anthony Campolo
Yep.
00:53:27 - Dylan Pierce
There's the next contest. You have to use Pipedream to
00:53:31 - Scott Steinlage
create a T shirt.
00:53:32 - Dylan Pierce
Order the T shirt, order the T shirt design, and order the T shirt. We'll give you an API endpoint, and you have to. You should train to do that.
00:53:39 - Scott Steinlage
You know what? You guys should have, like, some competitions or something for, like, doing silly things. Like, whoever can do this the most unique way or something. I don't know, you know?
00:53:49 - Dylan Pierce
Yeah, yeah, we've been talking about it. It'll definitely generate some.
00:53:52 - Scott Steinlage
Some. Yeah, or whatever. Yeah, it would generate some stuff, I'm sure.
00:53:56 - Dylan Pierce
Yeah, we've. We thought about doing an AI competition or hackathon. Yeah.
00:54:01 - Scott Steinlage
Yeah.
00:54:01 - Dylan Pierce
And there would be like a Rube Goldberg kind of competition: who can make the most ridiculous workflow that does basically nothing, or calls itself, or what have you. That's definitely on our radar. So that's kind of just the overview of Puppeteer plus AI, which I think is a really interesting combination, and this is just one of many things that you can do.
00:54:31 - Scott Steinlage
That's awesome, man. Thanks so much for sharing with us. This is really cool. I love seeing what Pipedream has because you guys always have some really cool stuff that's so usable. And not just usable, but good for your daily life of being a developer and, or just getting into doing things. If you haven't checked out Pipedream, you all should go check it out. It's free, you know, to start with. And, and you know, I'd highly suggest getting in there and messing with it. If you, if you've ever messed with like Zapier or Zapier, however you want to pronounce it and you're a developer, like you're silly. Just go to Pipedream.
00:55:09 - Dylan Pierce
It's.
00:55:10 - Scott Steinlage
It's the same thing, but better.
00:55:11 - Dylan Pierce
So much more of that one-off task where you're just like, I really don't want to open up yet another Node script. Do it in Pipedream. Just trust me.
00:55:22 - Scott Steinlage
Yeah, totally, absolutely.
00:55:24 - Anthony Campolo
Are there any exciting features on the horizon that people should look forward to?
00:55:31 - Dylan Pierce
What can I talk about? What can I talk about?
00:55:33 - Anthony Campolo
And if you have anything specific, I'll be curious, like generally how you see AI continuing to like expand within these kind of tools.
00:55:41 - Scott Steinlage
We love AI, so.
00:55:43 - Dylan Pierce
Yeah, yeah, so the AI, we're constantly improving the AI. I would say we're brainstorming ways to make the AI more aware of your workflow, so you can do things like base the prompting off of data in other parts of your steps, and train it better with embeddings based on the API docs of the service that you're asking about. Yeah, that's one of the things we've been researching, because right now it just uses the app name itself. So Puppeteer is well known and has had great documentation for a long time, but we need to feed it the updated documentation embedding.
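The embedding idea mentioned here can be illustrated with a small retrieval sketch. This is not Pipedream's implementation: it only assumes that documentation chunks have already been turned into embedding vectors by some model, and it ranks them against a query vector by cosine similarity. The `DocChunk` type and the `topMatches` helper are hypothetical names used for illustration.

```typescript
// A documentation chunk paired with a precomputed embedding vector.
// (Hypothetical shape; a real system would get vectors from an embedding model.)
interface DocChunk {
  text: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
// Returns 0 when either vector has zero magnitude.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks most similar to the query, best match first.
// The input array is copied so the caller's ordering is left untouched.
function topMatches(query: number[], chunks: DocChunk[], k: number): DocChunk[] {
  return chunks
    .slice()
    .sort(
      (x, y) =>
        cosineSimilarity(query, y.embedding) - cosineSimilarity(query, x.embedding)
    )
    .slice(0, k);
}
```

The chunks returned by `topMatches` would then be pasted into the prompt alongside the user's request, so the model sees current API documentation rather than relying only on the app name.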
00:56:30 - Scott Steinlage
Yeah, that's interesting. That's very interesting that you said that because that was something when I was messing with your guys'. When I was messing with the
00:56:44 - Dylan Pierce
AI
00:56:44 - Scott Steinlage
piece that you guys had
00:56:48 - Dylan Pierce
about PI. The bot thing.
00:56:50 - Scott Steinlage
No, not on Slack, but inside of here. When you first, like, launched the AI piece of this, that used ChatGPT inside of the Node.js.
00:57:03 - Dylan Pierce
Okay, code generation.
00:57:04 - Scott Steinlage
Yeah, code generation piece. Yeah, yeah, yeah, yeah. You know, it was, it was not getting the latest documentation. For example, I was trying to do something with Twitter and before they changed everything and you know, it was like getting the old documentation from 2021 and they had just changed it to several other things. So anyway, it was interesting. But yeah, having that would be huge. Game changer for sure.
00:57:31 - Dylan Pierce
Yeah. Yeah, definitely. Oh, the big thing that we are working towards, this is our big rock, is to introduce branching and looping. I can't believe this product's been around for six years now and it doesn't have it. I just haven't needed to use it that frequently. But an example of how this could be useful is for, like, Stripe webhooks. You know how you can make, like, one endpoint and subscribe to many events? You could branch that one webhook URL into many workflows.
00:58:05 - Scott Steinlage
Oh, cool. So if this happens, do all these different things. Yeah, yeah.
00:58:08 - Dylan Pierce
Like almost like a router. Like you have one endpoint and then you can use a branch that'll trigger
00:58:14 - Scott Steinlage
so many different things.
00:58:15 - Dylan Pierce
Right.
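The router idea described here can be approximated by hand with a conditional dispatch in code. The sketch below is illustrative only: the `WebhookEvent` shape is a simplified assumption rather than Stripe's full event object, and the handler names and return values are hypothetical stand-ins for "which downstream workflow to run."

```typescript
// Simplified, assumed shape of an incoming webhook event.
interface WebhookEvent {
  type: string;
  data: unknown;
}

// A handler decides what to do for one event type and returns a label
// naming the (hypothetical) downstream workflow it would trigger.
type Handler = (event: WebhookEvent) => string;

// One entry per event type the single endpoint is subscribed to.
const handlers = new Map<string, Handler>([
  ["invoice.paid", () => "send-receipt"],
  ["customer.subscription.deleted", () => "start-offboarding"],
]);

// Route an event to its handler; unknown event types are ignored.
function routeEvent(event: WebhookEvent): string {
  const handler = handlers.get(event.type);
  return handler ? handler(event) : "ignored";
}
```

Calling `routeEvent({ type: "invoice.paid", data: {} })` picks the receipt branch, while an event type with no entry falls through to `"ignored"`.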
00:58:16 - Scott Steinlage
And that doesn't happen right now.
00:58:17 - Anthony Campolo
Really?
00:58:17 - Scott Steinlage
I didn't realize that.
00:58:18 - Dylan Pierce
Yeah. You have to use conditionals in Node.js or Python in order to act on things conditionally. Right? Yeah. So we are working toward that. I'm actually going to release some documentation next week on how to trigger a workflow programmatically. But the UI for having a branch is yet to come. Yeah. And looping, because right now if you have to loop over a large amount of records, say you pull from Supabase or Airtable or whatever, you have this large amount of records and you're trying to do an async op on each one. You're hitting the 750-second Lambda limit. The way around it is to loop within Node.js and hit another workflow. So that way you have a processing workflow and an iteration workflow. And we're working on getting that native so you don't have to worry about it. Under the hood it'll make new workflows and kind of hide that from you. So those are the two big things that customers have been asking for for a long time. I've run into them myself, and I have workarounds like I just described.
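The looping workaround described above, an iteration workflow that hands work to a separate processing workflow, might be sketched as follows. This is a minimal sketch under stated assumptions: the processing workflow is assumed to have an HTTP trigger URL, the `{ records: [...] }` payload shape is made up for illustration, and `chunk`, `httpSender`, and `fanOut` are hypothetical helper names. It relies on the global `fetch` available in Node.js 18 and later.

```typescript
// Split records into fixed-size batches so each downstream call stays small.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) {
    throw new Error("size must be at least 1");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// A function that hands one batch to the processing workflow.
type SendBatch<T> = (batch: T[]) => Promise<void>;

// Build a sender that POSTs each batch as JSON to the processing
// workflow's trigger URL (the URL and payload shape are assumptions).
function httpSender<T>(url: string): SendBatch<T> {
  return async (batch: T[]): Promise<void> => {
    const res = await fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ records: batch }),
    });
    if (!res.ok) {
      throw new Error(`processing workflow returned ${res.status}`);
    }
  };
}

// Iteration workflow: send the batches one at a time and
// return how many batches were dispatched.
async function fanOut<T>(
  records: T[],
  size: number,
  send: SendBatch<T>
): Promise<number> {
  const batches = chunk(records, size);
  for (const batch of batches) {
    await send(batch);
  }
  return batches.length;
}
```

In use, the iteration workflow would call something like `fanOut(records, 50, httpSender(processingUrl))`, where `processingUrl` is a placeholder for the second workflow's endpoint. Passing the sender in as a parameter keeps the batching logic testable without making network calls.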
00:59:25 - Scott Steinlage
Right.
00:59:26 - Dylan Pierce
But I'm really excited about that.
00:59:28 - Anthony Campolo
Pinging it.
00:59:28 - Dylan Pierce
Much easier.
00:59:29 - Scott Steinlage
Yeah. That's cool. Awesome, man. Awesome. What about any other... Do you guys ever go to conferences or anything like that? Is that something you guys do, like a physical appearance? I don't know.
00:59:43 - Dylan Pierce
Not yet.
00:59:45 - Scott Steinlage
I just didn't know.
00:59:46 - Dylan Pierce
Yeah, yeah, yeah, yeah.
00:59:47 - Anthony Campolo
You think you will?
00:59:49 - Dylan Pierce
I would like to one day. We are still an incredibly small team. I'm shocked at how big this product is, and how wide and complicated it is, and there's really only like 10 of us. So it's still new. Like, we're still in bootstrap-like mode.
01:00:09 - Anthony Campolo
You got a lot of users though.
01:00:11 - Dylan Pierce
Yeah, a lot of users. We have a lot of users. We have a lot of competition as well.
01:00:17 - Anthony Campolo
I'll be curious. I know this is always difficult to talk about, but who do you see as your main competitors?
01:00:25 - Dylan Pierce
It's kind of funny, the way I see it. There's really not any right now, because if you look at existing workflow tools, they're all tailored to no-coders, and if they are tailored to coders, they don't have the integration creations that we do. Like, we handle, we have essentially a combination of Nango. Do you guys know Nango? It's like an OAuth authentication manager.
01:00:57 - Anthony Campolo
Anything about OAuth, I've managed to avoid, thankfully.
01:01:00 - Dylan Pierce
Exactly. Like no one wants to do it.
01:01:02 - Anthony Campolo
Yeah, but you said you had a lot of competitors, though, so there's a lot of people going out in the space. Sounds like that's what you're saying, even if they do it differently from.
01:01:11 - Dylan Pierce
They do it differently. Yeah, we're definitely unique in that we handle OAuth, we handle account connection, we have the code-first gab integration, and we're developer-focused. Whereas others only have one or two of those things at a time.
01:01:29 - Anthony Campolo
Yeah, the dev focus, I think, is what really stands out.
01:01:34 - Dylan Pierce
Yeah, there are other developer-focused, easy serverless tools, but they don't have the ease of use of connecting accounts or, like, the library of literally thousands of prebuilt actions and triggers.
01:01:46 - Anthony Campolo
Like, yeah, there's, like, "you could run JavaScript code." It's like, okay, cool, I could have done that with a Node server also.
01:01:53 - Dylan Pierce
Exactly. Like you can spin up a Lambda easily under the hood, but we have a layer on top that I think is so much more fine-tuned because it's been around much longer. Like the newer ones you mentioned, like Val Town, they're less than two years old. We're six years old.
01:02:10 - Anthony Campolo
So yeah, they're more like just giving you these sandboxes to run code, and then they're starting to build out libraries, I think, to connect things. So I think that's also interesting. And, you know, everyone has different ideas of what they want to stitch together and uses different tools. So some people will want, you know, like sheets the lobby, like bringing a spreadsheet. And some people, like, want to build Discord bots. And I think that's great because this stuff is so complicated. The more companies that provide tools to make it easier, the better.
01:02:40 - Dylan Pierce
Yeah, exactly. It's kind of like the unbundling of Rails, where now you have systems that do auth for you, like Clerk.
01:02:50 - Anthony Campolo
Yeah.
01:02:51 - Dylan Pierce
Background job services like QStash, Redis, Upstash. This is your integrations code. You don't have to worry about writing it yourself anymore, or at least not a lot of it. And it's hosted for you. I'm a big believer that DevOps should be hidden from you. Just be productive. You don't need to worry about DevOps. Just deliver value. Deliver solutions, not code.
01:03:20 - Anthony Campolo
Yeah, fire your platform team, you all.
01:03:23 - Scott Steinlage
All.
01:03:24 - Dylan Pierce
Yeah, there will be, like, very highly paid platform people. Kind of like how AWS sucked up all the infrastructure engineers that set up rack servers and stuff.
01:03:38 - Anthony Campolo
So where should people go online to find out more about Pipedream or to find you?
01:03:43 - Dylan Pierce
Yeah, definitely head over to pipedream.com. We have a really awesome community. And of course, there's an AI bot in there. It helps you generate code as well. And we're also happy to answer questions there. You can find me on X or Twitter, whatever you call it, at Control Alt Dylan, and I post frequently to our blog. And yeah, thanks for having me on, guys. It was really fun, and I'm sorry that Twitter decided to block us. That would have been really cool, but I'm glad that Reddit's still pretty open. That's pretty neat.
01:04:17 - Scott Steinlage
Yeah, no, that was cool. Awesome. Well, thank you so much, Dylan. Greatly appreciate your time today. Really glad that you came on with us and accepted the invite. So, yeah, remember, if you haven't checked out Pipedream yet, go check it out and hit up your man Dylan if you have any questions or just whatever. Yeah, awesome. Thank you so much. Greatly appreciate it. Anthony, anything else?
01:04:46 - Anthony Campolo
Nope. I hope people check this out and find something useful to do with it.
01:04:50 - Scott Steinlage
Totally. All right, thanks. Appreciate you all.
01:04:52 - Dylan Pierce
All.
01:04:52 - Scott Steinlage
We'll see you in the next one.
01:04:54 - Anthony Campolo
Next one. Peace.