How to Cancel Data Sync Runs Mid-Flight

Utku Zihnioglu

CEO & Co-founder

You hit run on a sync at 4:47 PM. By 4:51 you realize the field mapping is wrong: every contact in HubSpot is having their email overwritten with their company domain. The run shows 8,400 of 80,000 records processed. You need to cancel this run immediately. The pause button is greyed out. The cancel link in the docs talks about "stopping the next scheduled run." There is no next scheduled run. There is only this run, currently destroying your CRM.

This is the gap that separates production sync tools from glorified ETL scripts. The ability to cancel data sync runs mid-flight is not a nice-to-have. It is the first thing you need when something goes wrong, and it is the thing most tools quietly do not offer.

Why runaway sync jobs happen and what they cost

The most common causes of a runaway run are boring and avoidable in hindsight:

  • Wrong field mapping. You mapped company.domain to contact.email instead of contact.company_domain. Every contact gets the company domain as their email.

  • Missing filter. A test sync was supposed to write only "test_*" tagged records. The filter was never saved. The run is now writing all 80,000 records.

  • Schema drift. A column you depended on was renamed at the source. The sync is sending nulls for a required field. The destination is overwriting real data with empty values.

  • Accidental full re-sync. Someone toggled "resync historical data" thinking it would be a no-op. It is now scanning every record from 2019 forward.

  • Stale primary key. The mapping pointed at the wrong unique identifier. The destination is creating duplicates instead of updating, doubling your contact list.

The cost compounds with every record processed. A bad write to a CRM does not just damage a row; it triggers downstream automations. Marketing emails fire. Support workflows route to the wrong owner. Analytics dashboards capture the corrupted state in their nightly snapshot. By the time you notice, the blast radius is wider than the original sync.

If the sync runs to completion, recovery becomes an archaeology project. You restore from backup, hope the backup is recent, and then manually reconcile every change made by humans during the bad window. If you can stop the sync at 8,400 of 80,000, you have to fix 8,400 records instead of 80,000. The math on cancellation latency is brutal: every minute you cannot stop the run is roughly 1,000 records of cleanup work.

How to spot a stuck sync run before it finishes

Detection is the prerequisite for cancellation. You cannot stop a sync if you do not know it is misbehaving. Three signals matter:

  1. Record count drift. Compare the live record count being processed against the source size. A sync writing more records than exist at the source is creating duplicates. A sync writing 5x the expected volume usually means a missing filter or wrong record type.

  2. Error rate spike. A healthy run has an error rate near zero. When the rate jumps to 5% or more, the destination is rejecting writes. The usual cause is a schema mismatch between what the sync is sending and what the destination expects. Each rejected write is also a clue about the next 75,000 it will try.

  3. Duration past p99. Every sync has a normal completion time. A run that has crossed 2x its p99 baseline and is still going is processing more records than usual, hitting rate limits, or stuck on retries. None of these end well.

Most production teams set alerts on the second and third signal but watch the first manually. The first signal is the strongest one and the one most worth automating. If you have ever had to explain to a CEO why every contact in HubSpot now has a @yourcompany.com email, you will appreciate why.

The reason this matters: by the time the error rate spikes, you have already done damage. Record count drift catches the runaway in the first 60 seconds, before the destination gets confused enough to reject anything. The sync is happily writing nonsense at full speed because the destination is accepting nonsense.
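
The drift check in signal 1 is cheap to automate. A minimal sketch, assuming your sync tool's runs API exposes a live `records_written` counter (the field name here is illustrative, not any specific product's API):

```python
def check_drift(run_stats, source_count, max_ratio=1.2):
    """Flag a run whose write count exceeds the source size.

    run_stats: dict with the live 'records_written' counter for the run
    source_count: number of records that actually exist at the source
    max_ratio: tolerance above source size before we call it a runaway
    """
    written = run_stats["records_written"]
    if written > source_count * max_ratio:
        return "runaway"  # more writes than records: duplicates or missing filter
    return "ok"

# A run that has written 9,000 records against a 5,000-record source
# is creating duplicates and should be stopped.
status = check_drift({"records_written": 9000}, source_count=5000)
```

Polling this once a minute and alerting on `"runaway"` catches the failure mode in the first 60 seconds, before the destination rejects anything.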

Graceful cancel data sync vs force cancel: when to use each

Once you decide to stop a run, you have two choices, and they are not interchangeable.

Graceful cancel lets every in-flight record finish, then stops picking up new ones. The worker drains its current batch, commits the writes, and exits. You lose nothing in flight, but the records that were already queued for this run will be processed.

Force cancel kills the worker immediately. Any record being written when you hit the button is left in an indeterminate state: partially written, with no commit and no rollback. Replay is required to heal the destination.

| Situation | Use | Why |
| --- | --- | --- |
| Wrong field mapping caught early | Graceful | Stop the bleeding; in-flight writes are minimal |
| Hit downstream rate limit | Graceful | In-flight requests are already paid for; let them complete |
| Sync running 5x normal duration | Graceful | Likely retries; let current attempt resolve |
| Schema drift, nulls overwriting real data | Force | Every additional write is destructive |
| Accidental full resync of historical data | Force | Stop scanning before another batch loads |
| Bad mapping corrupting records on every write | Force | Do not let one more bad write happen |

The default should be graceful. You only reach for force when the act of completing in-flight work is itself the problem. A sync that is rate-limiting your transactional email provider, for example, is doing damage with every successful write, not every failed one.
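
At bottom, a graceful cancel is a flag the worker checks between batches: the current batch always commits, and the worker simply stops picking up new work. A minimal sketch (the `write_batch` callback and batch shapes are hypothetical, not any specific tool's internals):

```python
import threading

# Set by the control plane when an operator requests a graceful cancel.
cancel_requested = threading.Event()

def run_sync(batches, write_batch):
    """Process batches until done or until a graceful cancel is requested.

    The in-flight batch always runs to completion, so no write is left
    half-finished; the drain point sits between batches.
    """
    processed = 0
    for batch in batches:
        if cancel_requested.is_set():
            break  # drain point: previous batch already committed
        write_batch(batch)
        processed += len(batch)
    return processed

# Simulate an operator hitting "graceful cancel" during the first batch:
written = []
def write_batch(batch):
    written.extend(batch)
    cancel_requested.set()

processed = run_sync([[1, 2, 3], [4, 5, 6], [7, 8, 9]], write_batch)
# Only the first batch lands; the remaining two are never started.
```

A force cancel, by contrast, kills the worker without reaching the drain point, which is exactly why it can leave a batch half-written.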

A subtle point most operators miss: force cancel only helps if your destination writes are idempotent. If the sync writes "subtract 1 from balance" instead of "set balance to 42," replaying after a force cancel will double-decrement. Sync tools that promise safe force cancel are quietly assuming idempotent writes underneath. Most modern integrations are idempotent on contact and account records. They are often not idempotent on event records or counters.
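
The difference is easy to see in miniature. With a relative write, replaying after a force cancel applies the change twice; with an absolute write, replay is harmless:

```python
# Non-idempotent write: "subtract 1 from balance"
balance = {"acct_1": 43}
def decrement(state, acct):
    state[acct] -= 1

decrement(balance, "acct_1")
decrement(balance, "acct_1")  # replay after force cancel: double-applied

# Idempotent write: "set balance to 42"
balance_idem = {"acct_1": 43}
def set_value(state, acct, value):
    state[acct] = value

set_value(balance_idem, "acct_1", 42)
set_value(balance_idem, "acct_1", 42)  # replay: same result, no harm
```

The non-idempotent account ends at 41 instead of 42; the idempotent one ends at 42 no matter how many times the write replays.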

What to check after you cancel a data sync run

Cancellation is the start of recovery, not the end. The list of things to check, in order:

  1. Pull the list of touched records. Most sync tools log every record write at the run level. Export that list before anything else. It is the audit trail for what needs fixing. If your tool does not log per-record writes, you are flying blind and the rest of these steps are guesswork.

  2. Diff touched records against the source. Run a query that joins the touched record IDs against the current source state. Anywhere the destination value differs from the source value is where the bad mapping landed.

  3. Decide replay or rollback. If you have field-level history at the destination, rollback is fast. If not, replay with the corrected mapping is your safer path. Replay assumes the source still has the right values; rollback assumes you have a clean snapshot from before the bad run.

  4. Pause the sync schedule. Do this before you start fixing the mapping. The next scheduled run will trigger the same bug if you forget.

  5. Fix the mapping or filter. Whatever caused the runaway should not be possible to recreate by accident. If the bug was a missing filter, save the filter as a default. If it was a wrong field reference, add a guard rule.

  6. Resume with a small dry run. Before re-enabling the full schedule, run the sync against a 100-record sample. Verify the mapping is right. Then re-enable.
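
Steps 1 and 2 above amount to a join between the run's write log and the current source. A sketch, assuming you have both sides as plain dicts keyed by record ID (real implementations would query the sync tool's run log and the source API):

```python
def diff_touched(touched_ids, destination, source):
    """Return the IDs of touched records whose destination value
    no longer matches the source of truth.

    touched_ids: record IDs written by the bad run (from the run log)
    destination / source: {record_id: field_value} snapshots
    """
    return [
        rid for rid in touched_ids
        if destination.get(rid) != source.get(rid)
    ]

source = {"c1": "a@x.com", "c2": "b@y.com", "c3": "c@z.com"}
destination = {"c1": "x.com", "c2": "b@y.com", "c3": "z.com"}  # bad mapping hit c1, c3
needs_repair = diff_touched(["c1", "c2", "c3"], destination, source)
```

Every ID in `needs_repair` is a record the bad mapping actually landed on; records the run touched but did not corrupt drop out of the list.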

Skipping step 4 is one of the most common ways to make things worse. The mapping bug that caused the runaway is still in the config until you fix it; the schedule will fire again in 15 minutes and start the second runaway before you have finished cleaning up the first.

A practical recovery shortcut, if your sync tool tracks per-field writes: filter the record list by the specific field that was being written incorrectly. You only need to repair that field, not every property on every touched record. Property-level history turns "rebuild 8,400 contacts" into "reset one column on 8,400 contacts." That is often a 10-minute job instead of a half-day project.
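
In code, the single-field repair is just a targeted reset from the source of truth. A sketch with hypothetical record shapes:

```python
def repair_field(records, field, source_values):
    """Reset one field on each touched record from the source of truth,
    leaving every other property untouched."""
    for rec in records:
        rid = rec["id"]
        if rid in source_values:
            rec[field] = source_values[rid]
    return records

# Two contacts whose email was overwritten with the company domain:
contacts = [
    {"id": "c1", "email": "x.com", "name": "Ada"},
    {"id": "c3", "email": "z.com", "name": "Cy"},
]
fixed = repair_field(contacts, "email", {"c1": "a@x.com", "c3": "c@z.com"})
```

Note that `name` and every other property survive untouched; only the corrupted column is rewritten.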

Why most sync tools won't let you cancel data sync runs mid-flight

The architectural reason is simple. ETL tools designed for warehouse loading were built around the assumption that runs are atomic: they either complete or fail, and the destination tolerates retries. Mid-run cancellation was never a design requirement because Snowflake does not care if you interrupt a load and replay it tomorrow.

Operational sync tools inherited that warehouse architecture. They write into your CRM, your support platform, and your billing system, but the run model is the same as a warehouse load. The cancel button often does not exist, or it exists in the API but not the UI, or it cancels the next scheduled run instead of the current one. We have seen tools where the supported way to stop data sync jobs was to revoke the API token used by the sync, which crashes the worker mid-write and leaves the destination in worse shape than letting it finish.

A few practical things to check in any sync tool you depend on:

  • Does the runs view show an active, in-flight run, or only sync run history after the fact?

  • Does the cancel button work on a currently running job, or only the schedule?

  • Does the tool distinguish graceful from force, or does it pick one for you?

  • Does the run log per-record writes so you can audit what happened?

  • Can you pause the schedule independently of canceling the active run?

  • Can you cancel an ETL job that has already started, or only one that is queued?

If the answer to any of these is "no" or "I don't know," you have a gap. Most teams discover the gap during their first runaway run, which is a bad time to learn that there is no cancel button.

The deeper point is that mid-flight cancellation requires real-time orchestration: a workflow engine that can signal a running worker to drain or terminate, and a worker process that can respond to that signal without corrupting its current write. Most ETL tools do not have that. The newer wave of sync platforms is built on real workflow engines instead of cron-triggered scripts, and that architectural shift is what makes mid-run cancellation possible. We built Oneprofile's runs dashboard around exactly this need: every active run is visible, every run can be canceled in either mode, and every record written by the run is logged so you can audit and replay. Production sync is not just about getting data through. It is about being able to stop the data when something is wrong.

If you have ever stared at a 4:51 PM email overwrite in progress with no cancel button, you already know why this matters. If you haven't yet, you will. Plan for the cancel before you need it.
