If you’ve been working with Midjourney for any length of time, you’ve noticed that this amazing technology is missing one really important feature: an API.
There are hints one is coming (for one thing you can generate keys in your profile). But there’s no official Python library, npm module, or REST interface you can invoke to streamline your work. Instead you must laboriously…copy/paste…each…prompt…into the Discord /imagine interface. Ugh.
Of course, lack of official anything has never stopped the DIY nation from working around that limitation.
Is it possible to automate with Midjourney, even though there’s no official API? Yes! Let’s look at some methods.
Warning!
The Midjourney Terms of Service explicitly state:
You may not use automated tools to access, interact with, or generate Assets through the Services.
There’s a certain irony there.
Discord’s guidelines say:
Do not use self-bots or user-bots. Each account must be associated with a human, not a bot.
I’m not a lawyer so you’ll need to make your own decision about proceeding with these techniques.
Option #1: PyAutoGUI
We covered one method earlier, which uses PyAutoGUI to drive the browser robotically.
This method uses a GUI-driving Python script to emit commands to drive the Discord GUI. It’s a form of robotic process automation. Early modes of automating repetitive tasks focused on using APIs and scripts to step around human-oriented GUIs. RPAs use the same GUI as humans but have tooling to automate the interface. For example, RPAs can drive browsers, interpret results, branch, etc.
PyAutoGUI is pretty basic compared to enterprise-grade RPAs but it can get the job done.
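As a rough sketch of the idea (not the script from the earlier article), the snippet below types one /imagine command into whatever window has keyboard focus. It assumes the Discord message box is already focused. The key sequence (type /imagine, press Tab to accept the suggested command, type the prompt, press Enter) and the pause length are guesses you’d need to tune for your own setup. The submit_prompt helper and its driver parameter are made up for illustration; pyautogui.write and pyautogui.press are the real calls doing the work.

```python
import time


def submit_prompt(prompt, driver=None, pause=1.0):
    """Type one /imagine command into the currently focused window.

    Assumes the Discord message box already has keyboard focus.
    """
    if driver is None:
        # Imported lazily so the function can be tried with a stand-in
        # driver on a machine that has no display.
        import pyautogui
        driver = pyautogui

    driver.write("/imagine")  # start the slash command
    time.sleep(pause)         # give Discord time to show its command popup
    driver.press("tab")       # assumption: Tab accepts the suggested command
    driver.write(prompt)      # fill in the prompt field
    driver.press("enter")     # submit
```

To try it, call time.sleep(5) first so you have a few seconds to click into the Discord message box, then call submit_prompt("a lighthouse at dusk, watercolor").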
Option #2: Microsoft Excel
Part of the usefulness of Midjourney is having it generate all possibilities. You’d never hire an artist to paint 50 paintings of the same thing just so you could pick one, but that’s exactly what you can do, and frequently will do, with Midjourney.
If you’re just doing a few fun prompts now and then, that’s probably not a big deal. And some people want a step-by-step iterative feedback loop, where they tweak and perfect their prompt as they see how it returns.
But with Midjourney, often you will submit a few dozen prompts just to find the one you want to upsize or refine further. It’s very easy to hit a combinatorial explosion of prompts as you try out all variations of words to describe the scene you want, and then also styles, perspectives, aspect ratios, models, etc.
Keeping track of all those variables and then generating the possible prompts is the job of Midjourney Prompter, a Microsoft Excel spreadsheet.
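If you’d rather stay in Python than Excel, the same combinatorial idea can be sketched with itertools.product. This is not how the spreadsheet itself works; the build_prompts function and the sample subjects and styles are invented for illustration. The --ar suffix is Midjourney’s aspect ratio parameter.

```python
from itertools import product


def build_prompts(subjects, styles, aspect_ratios):
    """Return one prompt string per combination of the given options."""
    return [
        f"{subject}, {style} --ar {ratio}"
        for subject, style, ratio in product(subjects, styles, aspect_ratios)
    ]


prompts = build_prompts(
    subjects=["a lighthouse at dusk", "a fox in the snow"],
    styles=["watercolor", "35mm photo", "line art"],
    aspect_ratios=["1:1", "16:9"],
)

print(len(prompts))  # 2 * 3 * 2 = 12
print(prompts[0])    # a lighthouse at dusk, watercolor --ar 1:1
```

Two subjects, three styles, and two aspect ratios already give 12 prompts, which is why the count gets out of hand so quickly once you add more variables.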
Option #3: Endless Midjourney
There is a Chrome extension called Endless Midjourney which allows you to drive Midjourney in your browser.
It requires you to connect to Discord in Chrome, and you’ll need to do the usual work of setting up your own Discord server and adding the Midjourney bot to it (see the official docs), which you’ve probably already done.
After adding the extension, log in to Discord in your browser.
After entering your prompts, it will generate, upscale, save, and even notify you on job completions by email.
The only downside to this extension is that it’s not free for serious use. There’s a free plan which limits you to 10 MJs a day, but most folks will be paying $9.99 for unlimited gens. See their web site for an account and full details.
Option #4: thenextleg.io
Here’s one I have not personally played with: thenextleg.io.
The Next Leg is a community extension of Midjourney that provides API access to the platform’s features and services. By integrating The Next Leg into your applications, you can leverage the power of Midjourney to enhance user experience of your own products.
You’ll use it to make REST calls (such as “POST https://api.thenextleg.io/v2/imagine”), and their API seems very extensive, even to the extent of automating banned word appeals. The docs have plenty of examples for both JavaScript and Python.
However, there’s a $40/month price tag.
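For a sense of what a call to that imagine endpoint might look like, here is a minimal sketch using only the Python standard library. I haven’t run this against the service, so treat the request shape as an assumption: the "msg" field name, the Bearer token header, and the build_imagine_request helper are all guesses for illustration. Check their docs for the real format.

```python
import json
import urllib.request

API_URL = "https://api.thenextleg.io/v2/imagine"  # endpoint quoted above


def build_imagine_request(prompt, token):
    """Build (but do not send) a POST request for the imagine endpoint."""
    # Assumption: a JSON body with a "msg" field and a Bearer token header.
    body = json.dumps({"msg": prompt}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_imagine_request("a lighthouse at dusk", token="YOUR_TOKEN")
print(request.get_method(), request.full_url)
```

The sketch only builds the request. Passing it to urllib.request.urlopen(request) would actually send it, which needs a real token and a paid account.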
Option #5: The Midjourney Automation Bot
Here’s how this bot describes itself on GitHub:
The Midjourney Automation Bot is a highly efficient Python-based automation program designed to generate and download unique images using the Midjourney bot on Discord. The script employs OpenAI’s GPT-3 to construct image prompts and Playwright, a Node.js library to control Chromium, Firefox, and WebKit browsers, to interact with the Discord application in a browser environment.
I have not tried it, but the last commit was less than 10 days ago, so it’s still being actively worked on.
If this is the route you want to go, you might also find this article on Puppeteer interesting.