Get all tables used in join from SQLAlchemy
How to get all the tables present in the join from an SQLAlchemy session query join statement
Recently, I was working on one of my projects and I came across this issue of joining two tables. Now the thing is, I mostly stay away from using SQL and prefer to go with NoSQL databases, however this time, the requirement was to go with PostgreSQL so I ended up setting it up. Since the language I used for building backend services is Python (mostly
FastAPI), I went with
SQLAlchemy in order to connect to the database and get the proper data.
I actually think that it is a great library to connect to the database. I understand it's a bit of a hassle to set it up at the beginning but that set up actually helps a lot. Since, you set up a proper types etc for each column, it's easier to have proper type checking and avoid errors in the long run.
Anyway, to be honest, I am kind of liking using PostgreSQL and since it is
the language a lot of the Python developers use so I thought it will be worth trying it out. Now, I don't want to be hating on MySQL but honestly I feel like it is way too old school.
Well, my problem was I needed two tables. One would be a child table and the parent would keep a reference to the child. However, when I make a query I wanted to join the table and get an unified output.
So, I created the two tables in the following way.
Create the child table:
create table child (cid serial primary key, child_name varchar not null);
Create the parent table:
create table parent (pid serial primary key, parent_name varchar not null, child serial);
Once both the tables are created, I added a reference to the child id in the child column for the parent table.
alter table parent add foreign key (child) references child (cid);
Now, let's enter some sample data in the tables and try to fetch them the way we want it.
After entering the data, the tables looked like this
test=# select * from parent; pid | parent_name | child -----+-------------+------- 1 | Test Parent | 2 (1 row)
test=# select * from child; cid | child_name -----+------------ 2 | Test Child (1 row)
Now, as seen above, the parent with id
1 is referencing a child with id
2. If we want to fetch both the above data together, we would do a join. For example, the tables can be joined in the following way.
test=# select * from parent left join child on parent.child = child.cid; pid | parent_name | child | cid | child_name -----+-------------+-------+-----+------------ 1 | Test Parent | 2 | 2 | Test Child (1 row)
When we join the tables using a left join on the child id, we get the above output, which is just perfect. However, this is just the command on the SQL console. We need to do the same command in SQLAlchemy and get the same data so that it will be easier for us.
I tried joining the tables as pointed by a lot of StackOverflow answers. However, that just didn't work.
Now the answer basically says that we can do a join like in the following way:
Assuming you created a Session object from SQLAlchemy which would be named as
db.query(Parent).join(Child, Parent.child == Child.cid)
Anyone with a basic understanding of Python would think that the above result would return a Parent object with the Child attributes included since that is what we understand from the table outputs that SQL provides us.
Like, if we run a basic join command, like in the above part, we get one table with both the child and the parent rows included.
However, this is not the case in
SQLAlchemy. They actually work a bit different.
So, in SQLAlchemy, when the above statement is run, it returns a
Parent object with just the parent attributes. Now, I was expecting that they would provide us an object with both the attributes from
Child but they didn't.
Here's a logical solution as to why they do not do that. If we want an object with both the parent and child attributes, that would mean we would get an object of a Class that has both the
Parent class and the
Child class. In other words, if we do a join in
SQLAlchemy, they would have to create a new
Class that would inherit from all the table classes that are present in the join.
This behaviour, even though, it would make developers lives easier, would be pretty inefficient. Creating classes on the fly like that (as far as I understand) would be a total mess and would not be an optimized solution for a library that is used by so many people.
What is the solution you may ask. Well, it's not that difficult. Since, we understand the problem now which is that
SQLAlchemy cannot just create dynamic classes with columns from more than one table, why don't we pass it the tables we want it to work on? That way it will return us all the tables we need and run a proper join on them as well.
So, in our situation, we need two tables from
SQLAlchemy, one being the
Parent table and the other the
child table. So we can do something like the following:
parent, child = db.query(Parent, Child).join(Child, Parent.child == Child.cid)
The above statement returns two objects, one is a
Parent object with all the parent columns and the second a
Child object with all the child details. Thus we have all the data that we needed and we can just do whatever we want with this data now.