When trying to upgrade a self-hosted Postgres-backed PostHog instance, we have seen some users have issues with long-running migrations on tables in the hot path.
For example, when trying to migrate the persons table, some users have reported that a lock is never acquired, making the migration hang.
This usually happens because some analytics queries are taking too long to complete.
To get rid of these queries and run the migration, you can use the script below on the node where the PostHog Django server (web) is running.
⚠️ Warning! Proceed with caution. The following script has potentially destructive behavior.
Python script
# How to run:# Add this code to a delete_queries.py file# Run python manage.py shell < delete_queries.py# Set DRY_RUN=0 if you're sure you know what you're doingimport osfrom time import sleepfrom django.db import connectionQUERY_TIME_LIMIT = os.environ.get("QUERY_TIME_LIMIT", "5 minutes")TARGET_TABLE = os.environ.get("TARGET_TABLE", "posthog_person")DRY_RUN = os.environ.get("DRY_RUN", "1") == "1"MAX_TRIES = 10def fetch_open_queries():res = []with connection.cursor() as cursor:cursor.execute(f"""SELECTpid,now() - pg_stat_activity.query_start AS duration,queryFROM pg_stat_activityWHERE (now() - pg_stat_activity.query_start) > interval '{QUERY_TIME_LIMIT}'AND state = 'active'""")res = cursor.fetchall()return resopen_queries = fetch_open_queries()tries = 0while open_queries and len(open_queries) > 0 and tries < MAX_TRIES:for query_details in open_queries:if TARGET_TABLE in query_details[2]:print(f"Preparing to kill query {query_details[0]}:")print(query_details[2])print(f"It has been active for {query_details[1].seconds}s")if not DRY_RUN:with connection.cursor() as cursor:cursor.execute(f"SELECT pg_terminate_backend({query_details[0]})")print(f"Killed query {query_details[0]}")print("")open_queries = fetch_open_queries()tries += 1sleep(1)
I'd rather run the SQL myself
Instead of the script, you can also look for long running queries in Postgres yourself, like so:
SELECTpid,now() - pg_stat_activity.query_start AS duration,queryFROM pg_stat_activityWHERE (now() - pg_stat_activity.query_start) > interval '5 minutes'AND state = 'active'
Using the PIDs from the query above, you can then kill them like so:
SELECT pg_terminate_backend(pid)