I’m wondering if anyone else is having issues hitting TPM rate limits. Part of one of my flows requires passing quite a large number of tokens through an API call. Even after implementing some token-efficiency measures, I still find myself hitting rate limits, specifically the tokens-per-minute (TPM) limits.
Looking forward, I intend to make my tool available for public use, and when that happens I’ll need a way to mitigate hitting TPM rate limits - something like a queueing system. I haven’t found any node that could do the trick (Gumloop team, this could be really handy!). I may be able to use a Run Code node for this somehow, but before I go down that road I’m wondering if anyone else has the same issue and has found any solutions.
Hey @MarcMPSGC! If you’re reporting an issue with a flow or an error in a run, please include the run link and make sure it’s shareable so we can take a look.
Find your run link on the history page. Format: https://www.gumloop.com/pipeline?run_id={{your_run_id}}&workbook_id={{workbook_id}}
Make it shareable by clicking “Share” → “Anyone with the link can view” in the top-left corner of the flow screen.
Provide details about the issue—more context helps us troubleshoot faster.
Hey @MarcMPSGC – Just to clarify, are you using an Ask AI node with your own API key, and is that where you’re hitting the rate limit? If so, then yes, a Run Code node or a custom node with a Python script to handle batching would be the best route.
OpenAI has some guides that are really helpful for this, and I’ll link one below in case you want to reference it when writing your Python code. They specifically recommend using exponential backoff or batching requests through their batch API.
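In case it helps, here’s roughly what the exponential backoff pattern looks like inside a Run Code node. This is a minimal sketch, assuming the openai v1.x Python SDK and the tenacity library are available; the model name and prompt are placeholders, not your actual flow:

```python
# Minimal sketch: retry an OpenAI call with exponential backoff on rate limits.
# Assumes the openai v1.x SDK and tenacity are installed; swap in your own
# model, prompt, and API key handling.
from openai import OpenAI, RateLimitError
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_random_exponential

client = OpenAI()  # reads OPENAI_API_KEY from the environment

@retry(
    retry=retry_if_exception_type(RateLimitError),  # only retry rate-limit errors
    wait=wait_random_exponential(min=1, max=60),    # back off 1s -> 60s with jitter
    stop=stop_after_attempt(6),                     # give up after 6 attempts
)
def ask_ai(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # example model; use whichever you run in the flow
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(ask_ai("Summarize this document in one sentence: ..."))
```

The jittered wait is the key part: when several runs hit the limit at once, the randomness keeps them from all retrying at the same moment.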
Let me know if either of those options works for you or if you run into issues. You can also paste those OpenAI docs into an LLM to help generate the code for your Run Code node.
I agree that a batching node would be really useful; I’ll add that to the roadmap.
Yes, I’m using various Ask AI nodes with my own API keys, predominantly OpenAI but also Anthropic, Grok, and Perplexity.
And yes, this is where I’m hitting rate-limit issues. Currently it’s the tokens-per-minute limit that’s the problem.
Thanks for the resource! I’m going to look into Exponential Backoff.
I’ve come up with an interesting solution I’m going to try out that doesn’t require code: using Google Sheets Read/Write nodes to create a “queue”, where a column acts as a status that gates the flow. For example, before the Ask AI node I write a row to the sheet with a status of “Processing”, which halts other runs. Once the Ask AI node completes, I update the status to “Complete”, which allows the next run to proceed.
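For anyone who would rather approximate the same gate inside a Run Code node, here’s a rough sketch of the logic using the gspread library. It’s purely illustrative; the sheet name, credentials file, column layout, and polling interval are all assumptions, not part of the actual flow:

```python
# Rough sketch of the "status column as a gate" idea, assuming gspread and a
# service-account credentials file. Sheet name and column layout are made up.
import time
import gspread

gc = gspread.service_account(filename="credentials.json")  # hypothetical creds file
ws = gc.open("LLM Queue").sheet1                            # hypothetical sheet name

STATUS_COL = 2  # column B holds the status: "Processing" or "Complete"

def wait_for_turn(poll_seconds: int = 15) -> None:
    """Block until no other run is marked 'Processing'."""
    while "Processing" in ws.col_values(STATUS_COL):
        time.sleep(poll_seconds)

def run_gated_step(run_id: str) -> None:
    wait_for_turn()
    row = len(ws.col_values(1)) + 1
    ws.update_cell(row, 1, run_id)
    ws.update_cell(row, STATUS_COL, "Processing")    # claim the slot before the LLM call
    try:
        pass  # ... call the Ask AI / LLM step here ...
    finally:
        ws.update_cell(row, STATUS_COL, "Complete")  # release the slot for the next run

run_gated_step("run-123")
```

One thing worth flagging about this pattern (in either the no-code or the coded version): it isn’t race-proof, so two runs that start at exactly the same moment could both see an empty queue and proceed together.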