HiveBrain v1.2.0

Efficient Pandas to MySQL "UPDATE... WHERE"

Submitted by: @import:stackexchange-codereview
Tags: update, pandas, where, efficient, mysql

Problem

I have a pandas DataFrame and a (MySQL) database with the same columns. The database is not managed by me.

I want to update the values in the database in an "UPDATE... WHERE" style, updating only some columns wherever some other columns match.

Here's my code:

import sqlalchemy as sqla
from sqlalchemy.orm import sessionmaker

def save_to_db(final_df, passwd):
    engine_str = 'mysql+mysqldb://username:{}@localhost/mydb'.format(passwd)
    engine = sqla.create_engine(engine_str)

    sm = sessionmaker(bind=engine)
    session = sm()

    # Reflect the existing table; the schema is not managed here.
    metadata = sqla.MetaData(bind=engine)
    datatable = sqla.Table('AdcsLogForProduct', metadata, autoload=True)

    # One UPDATE statement per DataFrame row.
    for ind, row in final_df.iterrows():
        u = sqla.sql.update(datatable) \
            .values({"q_ECI_B_x": row.q_ECI_B_x,
                     "q_ECI_B_y": row.q_ECI_B_y,
                     "q_ECI_B_z": row.q_ECI_B_z,
                     "q_ECI_B_s": row.q_ECI_B_s}) \
            .where(sqla.and_(datatable.c.year == row.year,
                             datatable.c.month == row.month,
                             datatable.c.day == row.day,
                             datatable.c.hours == row.hours,
                             datatable.c.minutes == row.minutes,
                             datatable.c.seconds == row.seconds,
                             datatable.c.milliseconds == row.milliseconds,
                             datatable.c.microseconds == row.microseconds))
        session.execute(u)

    session.flush()
    session.commit()


I'm doing this with plain SQLAlchemy because pandas' built-in SQL support (DataFrame.to_sql) can only insert, append, or replace rows; it cannot express "UPDATE... WHERE". However, this is really slow.

Isn't there a more efficient way to do this?

Solution

You have eight conditions to match for every UPDATE. A typical solution would store timestamps using a DATETIME or TIMESTAMP column, so that there is only one value to match.
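On the pandas side, the eight component columns can be collapsed into one datetime column before writing the updates. A minimal sketch, assuming a new column named ts (the name is hypothetical; the real table would need a matching DATETIME column): pd.to_datetime accepts a DataFrame of component columns, and the plural names used in the question (hours, minutes, ..., microseconds) are all recognized unit keywords.

```python
import pandas as pd

# Hypothetical frame with the same split time columns as the table.
df = pd.DataFrame({
    "year": [2015], "month": [3], "day": [14],
    "hours": [9], "minutes": [26], "seconds": [53],
    "milliseconds": [589], "microseconds": [793],
})

# Collapse the eight components into a single datetime64 column.
df["ts"] = pd.to_datetime(df[["year", "month", "day", "hours",
                              "minutes", "seconds",
                              "milliseconds", "microseconds"]])
```

Each UPDATE then needs only a single equality (e.g. datatable.c.ts == row.ts) in its WHERE clause instead of an eight-way AND.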

For reasonable performance, ensure that the timestamp field is indexed.
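Independent of the schema fix, the per-row session.execute loop can also be collapsed into a single executemany round trip by passing a list of parameter dictionaries. A minimal sketch, using an in-memory SQLite table as a stand-in for the MySQL one (table and column names follow the question; the ts column is an assumption):

```python
import pandas as pd
import sqlalchemy as sqla

# Stand-in for 'mysql+mysqldb://username:...@localhost/mydb'.
engine = sqla.create_engine("sqlite://")

with engine.begin() as conn:
    conn.execute(sqla.text(
        "CREATE TABLE AdcsLogForProduct "
        "(ts DATETIME PRIMARY KEY, q_ECI_B_x FLOAT)"))
    conn.execute(sqla.text(
        "INSERT INTO AdcsLogForProduct VALUES ('2015-03-14 09:26:53', 0.0)"))

final_df = pd.DataFrame({"ts": ["2015-03-14 09:26:53"],
                         "q_ECI_B_x": [0.25]})

# Passing a list of dicts makes SQLAlchemy use DBAPI executemany:
# one round trip for all rows instead of one UPDATE per row.
with engine.begin() as conn:
    conn.execute(
        sqla.text("UPDATE AdcsLogForProduct "
                  "SET q_ECI_B_x = :q_ECI_B_x WHERE ts = :ts"),
        final_df.to_dict("records"),
    )
```

With ts as the primary key (or otherwise indexed), each UPDATE is a single indexed lookup rather than a scan over eight unindexed columns.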

Context

StackExchange Code Review Q#81795, answer score: 7

Revisions (0)

No revisions yet.