A common criticism of JPA is that it generates inefficient SQL, often resulting in several queries where one would do. In particular, it’s prone to the n+1 problem where it loads an entity with one query and then runs an additional query for each item in an owned collection. JPA Entity Graph is one way to avoid this. It allows the application developer to closely control how related entities are fetched. You can use JPA Entity Graph to tell JPA to load a complex web of entities with a single query and significantly improve performance of your application.
Loading related entities
Here’s a deliberately complicated entity model:

Now let’s say I want to find all the movies reviewed by a reviewer. I also want to see related information: the movie’s studio, its stars and awards won by its stars. When I run the query, I can see the n+1 queries in the logs.
Hibernate: select r1_0.id,r1_0.name from reviewer r1_0 where r1_0.name=?
Hibernate: select r1_0.reviewer_id,r1_1.id,m1_0.id,s1_0.id,s1_0.name,m1_0.title,r2_0.id,r2_0.name,r1_1.wording from reviewer_reviews r1_0 join review r1_1 on r1_1.id=r1_0.reviews_id left join movie m1_0 on m1_0.id=r1_1.movie_id left join studio s1_0 on s1_0.id=m1_0.studio_id left join reviewer r2_0 on r2_0.id=r1_1.reviewer_id where r1_0.reviewer_id=?
Hibernate: select s1_0.movie_id,s1_1.id,s1_1.first_name,s1_1.last_name from movie_star s1_0 join star s1_1 on s1_1.id=s1_0.star_id where s1_0.movie_id=?
Hibernate: select s1_0.movie_id,s1_1.id,s1_1.first_name,s1_1.last_name from movie_star s1_0 join star s1_1 on s1_1.id=s1_0.star_id where s1_0.movie_id=?
Hibernate: select s1_0.movie_id,s1_1.id,s1_1.first_name,s1_1.last_name from movie_star s1_0 join star s1_1 on s1_1.id=s1_0.star_id where s1_0.movie_id=?
Hibernate: select a1_0.star_id,a1_1.id,ab1_0.id,ab1_0.award_name,ab1_0.organization,a1_1.category,r1_0.id,r1_0.first_name,r1_0.last_name from star_awards a1_0 join award a1_1 on a1_1.id=a1_0.awards_id left join award_body ab1_0 on ab1_0.id=a1_1.award_id left join star r1_0 on r1_0.id=a1_1.recipient_id where a1_0.star_id=?
Hibernate: select a1_0.star_id,a1_1.id,ab1_0.id,ab1_0.award_name,ab1_0.organization,a1_1.category,r1_0.id,r1_0.first_name,r1_0.last_name from star_awards a1_0 join award a1_1 on a1_1.id=a1_0.awards_id left join award_body ab1_0 on ab1_0.id=a1_1.award_id left join star r1_0 on r1_0.id=a1_1.recipient_id where a1_0.star_id=?
Hibernate: select a1_0.star_id,a1_1.id,ab1_0.id,ab1_0.award_name,ab1_0.organization,a1_1.category,r1_0.id,r1_0.first_name,r1_0.last_name from star_awards a1_0 join award a1_1 on a1_1.id=a1_0.awards_id left join award_body ab1_0 on ab1_0.id=a1_1.award_id left join star r1_0 on r1_0.id=a1_1.recipient_id where a1_0.star_id=?
Hibernate: select a1_0.star_id,a1_1.id,ab1_0.id,ab1_0.award_name,ab1_0.organization,a1_1.category,r1_0.id,r1_0.first_name,r1_0.last_name from star_awards a1_0 join award a1_1 on a1_1.id=a1_0.awards_id left join award_body ab1_0 on ab1_0.id=a1_1.award_id left join star r1_0 on r1_0.id=a1_1.recipient_id where a1_0.star_id=?
Hibernate: select a1_0.star_id,a1_1.id,ab1_0.id,ab1_0.award_name,ab1_0.organization,a1_1.category,r1_0.id,r1_0.first_name,r1_0.last_name from star_awards a1_0 join award a1_1 on a1_1.id=a1_0.awards_id left join award_body ab1_0 on ab1_0.id=a1_1.award_id left join star r1_0 on r1_0.id=a1_1.recipient_id where a1_0.star_id=?
Hibernate: select a1_0.star_id,a1_1.id,ab1_0.id,ab1_0.award_name,ab1_0.organization,a1_1.category,r1_0.id,r1_0.first_name,r1_0.last_name from star_awards a1_0 join award a1_1 on a1_1.id=a1_0.awards_id left join award_body ab1_0 on ab1_0.id=a1_1.award_id left join star r1_0 on r1_0.id=a1_1.recipient_id where a1_0.star_id=?
Hibernate: select a1_0.star_id,a1_1.id,ab1_0.id,ab1_0.award_name,ab1_0.organization,a1_1.category,r1_0.id,r1_0.first_name,r1_0.last_name from star_awards a1_0 join award a1_1 on a1_1.id=a1_0.awards_id left join award_body ab1_0 on ab1_0.id=a1_1.award_id left join star r1_0 on r1_0.id=a1_1.recipient_id where a1_0.star_id=?
I’m loading the reviewer in the first query. Then loading related movies and reviews. Then loading each star one by one. Then loading each award one by one. It took 12 queries to load all related entities for just this toy data set. If entities have large collections of related entities, this would be a massive performance problem.
A naive solution is to change all fetch types to EAGER. In my application, all @OneToMany and @ManyToMany relationships are FetchType.LAZY. This is often the best choice as it gives you flexibility to load relationships eagerly if you choose to. If you use FetchType.EAGER though, you don’t have that flexibility. You can’t switch to lazy load if your use case requires it and you may end up loading massive amounts of data you don’t need every time – also a performance problem.
So, I want to declare all relationships with FetchType.LAZY but I want to load them EAGER in this case. I also want JPA to JOIN FETCH related entities. That is, fetch them in a single SQL query with all related tables JOINed to return all required data. A previous post on JPA native queries demonstrates how to do this using JPQL. This has the disadvantage of requiring you to craft a query. The great thing about Spring Data JPA in particular is that it creates your queries for you.
I want eager loading at runtime, JOIN FETCH semantics and the simplicity of Spring Data JPA Query Method derived queries. This can be achieved by explicitly defining the fetch plan at runtime using JPA Entity Graph.
JPA Entity Graph
JPA Entity Graph allows you to control the fetch plan. That is, what related entities will be fetched along with the current entity.
My requirement is simple: load all related entities when I load a Reviewer entity. I’ve used the @NamedEntityGraph annotation on the Reviewer entity to define my JPA Entity Graph:
@NamedEntityGraph(
name = "Reviewer.allRelatedEntities",
attributeNodes = @NamedAttributeNode(value = "reviews", subgraph = "Review.allRelatedEntities"),
subgraphs = {
@NamedSubgraph(
name = "Review.allRelatedEntities",
attributeNodes = @NamedAttributeNode(value = "movie", subgraph = "Movie.allRelatedEntities")
),
@NamedSubgraph(
name = "Movie.allRelatedEntities",
attributeNodes = {
@NamedAttributeNode(value = "stars", subgraph = "Star.allRelatedEntities"),
@NamedAttributeNode(value = "studio")
}
),
@NamedSubgraph(
name = "Star.allRelatedEntities",
attributeNodes = @NamedAttributeNode(value = "awards", subgraph = "Award.allRelatedEntities")
),
@NamedSubgraph(
name = "Award.allRelatedEntities",
attributeNodes = @NamedAttributeNode(value = "awardBody")
)
}
)
@Entity
public class Reviewer {
...
}
This looks complicated so let’s step through it. First, my @NamedEntityGraph has a name: Reviewer.allRelatedEntities. The name is arbitrary. I choose to prefix it with the entity name but that’s just a convention and not required. It then contains one or more attributeNodes. These define what attributes of this entity should be fetched. In my case, Reviewer has a single related entity: a Set of Review entities named reviews.
@OneToMany(fetch = FetchType.LAZY)
private Set<Review> reviews = new HashSet<>();
I want to load all entities related to Review too, so I’ve defined a subgraph named Review.allRelatedEntities. This is defined in the subgraphs attribute. If you follow the attributeNodes and subgraphs, the JPA Entity Graph ends up including all entities reachable from Reviewer. That is Review, Movie, Studio, Star, Award and AwardBody.
I then use the JPA Entity Graph by referring to it by name using an annotation in my Repository class:
public interface ReviewerRepository extends CrudRepository<Reviewer, Long> {
@EntityGraph(value = "Reviewer.allRelatedEntities")
Reviewer findByName(String name);
}
How this affects queries and entity fetch
If you request a Reviewer using the annotated Repository method, JPA now runs the following query:
select r1_0.id,r1_0.name,r2_0.reviewer_id,r2_1.id,m1_0.id,s1_0.movie_id,s1_1.id,a1_0.star_id,a1_1.id,ab1_0.id,ab1_0.award_name,ab1_0.organization,a1_1.category,a1_1.recipient_id,s1_1.first_name,s1_1.last_name,s2_0.id,s2_0.name,m1_0.title,r2_1.reviewer_id,r2_1.wording
from reviewer r1_0
left join reviewer_reviews r2_0 on r1_0.id=r2_0.reviewer_id
left join review r2_1 on r2_1.id=r2_0.reviews_id
left join movie m1_0 on m1_0.id=r2_1.movie_id
left join movie_star s1_0 on m1_0.id=s1_0.movie_id
left join star s1_1 on s1_1.id=s1_0.star_id
left join star_awards a1_0 on s1_1.id=a1_0.star_id
left join award a1_1 on a1_1.id=a1_0.awards_id
left join award_body ab1_0 on ab1_0.id=a1_1.award_id
left join studio s2_0 on s2_0.id=m1_0.studio_id
where r1_0.name=?
That’s it. Just a single (admittedly complicated) query. This fetches attributes of all the entities referred to in the JPA Entity Graph and constructs the entity objects.
The advantage is that we’ve now loaded everything with a single trip to the database. The disadvantage is that the query now joins 10 tables so we’ve just pushed a heap of work from your application to the database. If you’re making changes like this, test the impact on performance! It may be that you were quicker making multiple trips to the database.
Very often though, this technique improves performance. You’ve pushed the effort to join tables / entities down to the database. Relational databases are specifically optimised for this. What’s more, they offer ways for the application developer to improve performance further with column indexes, query plan caching and optimiser hints. Usually it is better to perform this sort of data join operation in the relational database management system (RDBMS) than in application code.
The full source code of this demo application is in my GitHub repo.
Be First to Comment