[Feature Request] - Parallel Fetching of Associated Fields in SELECT Mode #1018

Open
wangxing-git opened this issue May 5, 2025 · 2 comments
Labels
enhancement New feature or request pending Postpone plans because of more important tasks

Comments

@wangxing-git

Reason

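Background

Currently, when the fetch mode is SELECT, Jimmer's Fetcher loads associated fields serially. Even when several associated fields could be queried at the same time, they are still triggered one by one. With a high-latency database or a complex object tree, this easily becomes a performance bottleneck.

However, these operations are IO-bound rather than CPU-bound, so parallelizing them could significantly improve throughput.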


Problem

Currently, in SELECT mode, Jimmer's Fetcher loads associated fields sequentially, even if they are independent. These operations are IO-bound (not CPU-bound), and in many scenarios (e.g. high-latency DB, large object graphs), sequential loading becomes a performance bottleneck.

Description

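Feature proposal

I would like Jimmer to:

  1. Load the multiple associated fields of a fetcher in parallel rather than serially;
  2. Support doing so with a thread pool (Java) or coroutines (Kotlin);
  3. Provide configurable parallelism (e.g. maximum thread count, whether parallel loading is enabled);
  4. Guarantee transactional consistency and safe exception propagation.

Usage example

In Java: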

Fetcher<User> fetcher = USER_FETCHER
    .allScalarFields()
    .teams()
    .roles();

userRepository.findById(1L, fetcher);

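Currently, teams() and roles() are executed sequentially; loading them concurrently would significantly improve performance.

Example of the ideal configuration: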

JimmerParallelOptions options = JimmerParallelOptions.builder()
    .enableParallelAssociationFetching(true)
    .maxConcurrency(4)
    .executor(myCustomExecutor)
    .build();

Proposal

Allow parallel fetching of associated fields by:

  1. Enabling parallel execution of association fetching;
  2. Using thread pools (in Java) or coroutines (in Kotlin);
  3. Providing a way to configure concurrency (e.g. max thread count, executor, toggles);
  4. Ensuring safety for transaction boundaries, and exception handling.

Example

Fetcher<User> fetcher = USER_FETCHER
    .allScalarFields()
    .teams()
    .roles();

Currently, teams() and roles() are loaded one after another. Parallel execution could dramatically reduce response time.

Ideal Configuration Example

JimmerParallelOptions options = JimmerParallelOptions.builder()
    .enableParallelAssociationFetching(true)
    .maxConcurrency(4)
    .executor(myCustomExecutor)
    .build();
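
As a rough illustration of what the proposal asks for, the sketch below fans out two independent loaders on a fixed-size executor and waits for both. It does not use Jimmer at all: `ParallelAssociationSketch`, `slowQuery` and `loadAll` are hypothetical names invented for this example, the "queries" are simulated with `Thread.sleep`, and `JimmerParallelOptions` above remains a proposed API, not an existing one.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;
import java.util.stream.Collectors;

// Hypothetical sketch: none of these names are Jimmer APIs.
class ParallelAssociationSketch {

    // Stand-in for one association query (e.g. "load the teams of user 1").
    static <T> Supplier<T> slowQuery(T result, long millis) {
        return () -> {
            try {
                Thread.sleep(millis); // simulated database round trip
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            return result;
        };
    }

    // Submits every loader to the executor first, then waits for all of them.
    // join() rethrows a loader failure wrapped in CompletionException.
    static <T> List<T> loadAll(List<Supplier<T>> loaders, ExecutorService executor) {
        List<CompletableFuture<T>> futures = loaders.stream()
                .map(loader -> CompletableFuture.supplyAsync(loader, executor))
                .collect(Collectors.toList());
        return futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The pool size plays the role of the proposed maxConcurrency(4).
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            Supplier<List<String>> teams = slowQuery(Arrays.asList("team-a", "team-b"), 200);
            Supplier<List<String>> roles = slowQuery(Arrays.asList("admin"), 200);
            long start = System.nanoTime();
            List<List<String>> results = loadAll(Arrays.asList(teams, roles), executor);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(results + " loaded in about " + elapsedMs + " ms");
        } finally {
            executor.shutdown();
        }
    }
}
```

Because both loaders are submitted before either is joined, the total wait is governed by the slower loader rather than by the sum of the two, which is the effect the request is after.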

Existing solutions

No response

@wangxing-git wangxing-git added the enhancement New feature or request label May 5, 2025
@babyfish-ct
Owner

We have considered and discussed the issues below. This feature will not be taken up before version 1.0.

  1. The query API has an overloaded version that explicitly specifies the JDBC Connection. For this overloaded version, concurrency is not allowed.
  2. For the overloaded version that does not specify a JDBC Connection, asynchronicity may lead to isolation-level issues. For example, there is a classic ORM scenario where a query is executed immediately after a modification within the same Spring transaction context. At this point, the transaction has not yet been committed, and the latest data cannot be queried on a brand-new JDBC Connection. Currently, Jimmer is in the 0.x phase, still in the promotion and development stage, with high-level users being the minority. If ordinary users encounter such issues, it would result in significant explanation costs and hinder promotion. Only when more experienced users support Jimmer and form a more vibrant community in the future will it be the right time to introduce such behavior.
  3. Due to the reasons discussed in point 2, currently, this kind of concurrent loading is more commonly adopted by solutions like GraphQL DataLoader. At present, most ORM frameworks still rely on Transaction AOP from frameworks like Spring to ensure operations are executed on the expected JDBC Connection, guaranteeing that developers always get the expected results. This is the most important thing. In the future, even if Jimmer 2.0 considers this feature, it will need to be explicitly enabled, not the default behavior. This is because developers need to understand that the classic transaction management mechanism will no longer be effective, and unexpected data might be queried. Users must be aware of all risks.
  4. Currently, the asynchronous concurrency mechanisms in the Java ecosystem are too chaotic, with traditional connection pools and Kotlin coroutines. However, I believe virtual threads—especially after Java 24, where synchronized is improved—are the future and should become the unified standard. Traditional connection pools and Kotlin coroutines should fade away. The world only needs smarter synchronous blocking (as simple as in the Golang world) and does not need asynchronous callbacks (even if they are as well-hidden as Kotlin coroutines). Therefore, I plan to support such functionality in a unified manner only when the entire JVM ecosystem better supports virtual threads.
  5. Currently, Jimmer 1.0 has a large set of core ORM capabilities that still need to be supplemented. Before version 1.0, no major refactoring of non-core capabilities will be scheduled.
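
The connection concern in points 2 and 3 can be illustrated without a database. Thread-bound transaction managers such as Spring's keep the current connection in thread-local storage, so work handed to another thread does not see it. The sketch below is an assumption-laden stand-in: `ThreadBoundConnectionSketch`, `CURRENT_CONNECTION` and the string "connections" are hypothetical and only mimic that binding.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustration only: CURRENT_CONNECTION mimics how a thread-bound transaction
// manager associates a connection with the calling thread.
class ThreadBoundConnectionSketch {

    static final ThreadLocal<String> CURRENT_CONNECTION = new ThreadLocal<>();

    // Returns the connection visible to whichever thread runs this call.
    static String visibleConnection() {
        String bound = CURRENT_CONNECTION.get();
        return bound != null ? bound : "new connection (cannot see uncommitted changes)";
    }

    // Binds a connection on the caller, then observes it from the caller
    // and from a pool thread.
    static String[] observe(ExecutorService executor) {
        CURRENT_CONNECTION.set("transactional connection");
        try {
            String onCaller = visibleConnection();
            String onWorker = CompletableFuture
                    .supplyAsync(ThreadBoundConnectionSketch::visibleConnection, executor)
                    .join();
            return new String[] {onCaller, onWorker};
        } finally {
            CURRENT_CONNECTION.remove(); // always unbind, as a real transaction manager would
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            String[] seen = observe(executor);
            System.out.println("caller thread: " + seen[0]);
            System.out.println("worker thread: " + seen[1]);
        } finally {
            executor.shutdown();
        }
    }
}
```

The worker thread falls back to a fresh connection, which is why a query issued there would miss a modification made earlier in the same uncommitted transaction.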

@wangxing-git
Author

@babyfish-ct Thank you for the detailed explanation and thoughtful considerations. I fully understand the concerns regarding version planning, transaction safety, and the potential learning curve for new users.

In traditional ORM frameworks, it's quite common for developers to manually load associated entities, and in many of our projects, we are already used to doing this concurrently—especially in read-only query scenarios that do not rely on Spring's transaction mechanism.

What makes this request relevant to Jimmer is precisely its elegant Fetcher design. If I want to parallelize data loading today, I have to bypass the Fetcher and manually assemble DTOs with concurrent calls. This adds extra complexity and diminishes the clarity and maintainability that the Fetcher offers.
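
For readers unfamiliar with that workaround, here is a minimal sketch of what hand-written concurrent assembly tends to look like in a read-only context. Every name in it (`ManualAssemblySketch`, `UserView`, `findTeamNames`, `findRoleNames`) is hypothetical, and the two finder methods return fixed data in place of real repository queries.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical names throughout; no Jimmer API is used here.
class ManualAssemblySketch {

    // Hand-written DTO that a Fetcher would otherwise shape declaratively.
    static final class UserView {
        final long id;
        final List<String> teams;
        final List<String> roles;

        UserView(long id, List<String> teams, List<String> roles) {
            this.id = id;
            this.teams = teams;
            this.roles = roles;
        }
    }

    // Stand-ins for two independent, read-only association queries.
    static List<String> findTeamNames(long userId) {
        return Arrays.asList("team-a", "team-b");
    }

    static List<String> findRoleNames(long userId) {
        return Arrays.asList("admin");
    }

    // Starts both queries on the executor and combines them once both finish.
    static UserView loadUserView(long userId, ExecutorService executor) {
        CompletableFuture<List<String>> teams =
                CompletableFuture.supplyAsync(() -> findTeamNames(userId), executor);
        CompletableFuture<List<String>> roles =
                CompletableFuture.supplyAsync(() -> findRoleNames(userId), executor);
        return teams.thenCombine(roles, (t, r) -> new UserView(userId, t, r)).join();
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            UserView view = loadUserView(1L, executor);
            System.out.println(view.id + " " + view.teams + " " + view.roles);
        } finally {
            executor.shutdown();
        }
    }
}
```

Each additional association needs another future and another constructor argument, which is the bookkeeping a Fetcher-level option would remove.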

Of course, I understand your focus on solidifying core features before 1.0. I just hope that in the future, you might consider offering some form of explicit opt-in API for enabling concurrent loading—at least for advanced users who are working in controlled, non-transactional contexts. Even a low-level API would be extremely helpful in such cases.

Thanks again for the open discussion and your vision for Jimmer. I'm very much looking forward to seeing what the framework evolves into after 1.0!

@babyfish-ct babyfish-ct added the pending Postpone plans because of more important tasks label May 5, 2025