List of Research Achievements
Journal Article - rm_published_papers: Scientific Journal
Published 12/2025
Geriatrics
Journal Article - rm_published_papers: International Conference Proceedings
KuzushijiGen: A Real-Time Few-Shot Japanese Kuzushiji Generator via Differentiable Rendering
Published 12/2025
Proc. of ACM Multimedia Asia
Journal Article - rm_published_papers: Scientific Journal
Entropy-Guided Search Space Optimization for Efficient Neural Network Pruning
Published 11/2025
Algorithms
Journal Article - rm_published_papers: International Conference Proceedings
PosBridge: Multi-View Positional Embedding Transplant for Identity-Aware Image E
Published 11/2025
Proc. of British Machine Vision Conference (BMVC)
Journal Article - rm_published_papers: International Conference Proceedings
Mobile Food Calorie Estimation Using Smartphone LiDAR Sensor
Published 11/2025
Proc. of Asian Conference on Pattern Recognition (ACPR)
Journal Article - rm_published_papers: International Conference Proceedings
Decoupled Clip and Dynamic Sampling Policy Optimization for Food Reasoning Segmentation
Published 11/2025
Proc. of ACMMM Workshop on MetaFood
Journal Article - rm_published_papers: International Conference Proceedings
Diffusion-Guided 3D-Aware Calorie Estimation from a Single Food Image
Published 11/2025
Proc. of ACMMM Workshop on MetaFood
Journal Article - rm_misc: Others
SceneTextStylizer: A Training-Free Scene Text Style Transfer Framework with Diffusion Model
Published 13/10/2025
With the rapid development of diffusion models, style transfer has made
remarkable progress. However, flexible and localized style editing for scene
text remains an unsolved challenge. Although existing scene text editing
methods can edit text regions, they are typically limited to content
replacement and simple styles and lack the ability to perform free-style
transfer. In this paper, we introduce SceneTextStylizer, a novel training-free
diffusion-based framework for flexible and high-fidelity style transfer of text
in scene images. Unlike prior approaches that either perform global style
transfer or focus solely on textual content modification, our method enables
prompt-guided style transformation specifically for text regions, while
preserving both text readability and stylistic consistency. To achieve this, we
design a feature injection module that leverages diffusion model inversion and
self-attention to transfer style features effectively. Additionally, a region
control mechanism is introduced that applies a distance-based mask, updated at
each denoising step, to enable precise spatial control. To further enhance
visual quality, we incorporate a style enhancement module based on the Fourier
transform to reinforce stylistic richness. Extensive experiments demonstrate
that our method achieves superior performance in scene text style
transformation, outperforming existing state-of-the-art methods in both visual
fidelity and text preservation.
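As an illustration of the region control and Fourier-based style enhancement
described in the abstract, a minimal Python sketch is given below. It assumes a
hypothetical pipeline: the function names, the feathering schedule for the
distance-based mask, and the amplitude/phase mixing are illustrative
assumptions, not the SceneTextStylizer implementation.

# Minimal sketch (assumed, not the paper's code): a distance-based soft mask that
# is re-applied at every denoising step to confine style edits to the text region,
# plus a Fourier amplitude mix standing in for the style enhancement module.
import numpy as np
from scipy.ndimage import distance_transform_edt

def distance_region_mask(text_mask: np.ndarray, step: int, num_steps: int,
                         max_feather: float = 12.0) -> np.ndarray:
    """Soft 0..1 mask whose feathered border tightens as denoising proceeds."""
    # Distance (in pixels) from each background pixel to the nearest text pixel.
    dist = distance_transform_edt(text_mask == 0)
    # Assumed schedule: wide feather at early steps, tight feather at late steps.
    feather = 1.0 + max_feather * (1.0 - step / max(num_steps - 1, 1))
    return np.clip(1.0 - dist / feather, 0.0, 1.0)

def blend_latents(styled: np.ndarray, original: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Keep styled content inside the soft text region, original content outside."""
    return mask[None] * styled + (1.0 - mask[None]) * original

def fourier_style_mix(content: np.ndarray, style: np.ndarray, alpha: float = 0.6) -> np.ndarray:
    """Mix amplitude spectra (style statistics) while keeping the content phase (structure)."""
    fc, fs = np.fft.fft2(content), np.fft.fft2(style)
    amp = (1.0 - alpha) * np.abs(fc) + alpha * np.abs(fs)
    return np.real(np.fft.ifft2(amp * np.exp(1j * np.angle(fc))))

if __name__ == "__main__":
    mask = np.zeros((64, 64))
    mask[20:40, 10:54] = 1.0                               # toy text-region mask
    styled = np.random.rand(4, 64, 64)
    original = np.random.rand(4, 64, 64)
    for t in range(50):                                    # per-step region control
        latents = blend_latents(styled, original, distance_region_mask(mask, t, 50))
    print(latents.shape)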
Journal Article - rm_misc: Others
Published 07/10/2025
Recent human-object interaction detection (HOID) methods rely heavily on prior
knowledge from vision-language models (VLMs) to enhance their interaction
recognition capabilities. The training strategies and model architectures
needed to connect the knowledge from VLMs to the HOI instance representations
produced by the object detector are challenging to design, and the resulting
frameworks are complex to extend or apply. On the other hand, the inherent
reasoning abilities of multimodal large language models (MLLMs) on human-object
interaction detection remain under-explored. Inspired by the recent success of
training MLLMs with reinforcement learning (RL) methods, we propose HOI-R1 and,
for the first time, explore the potential of the language model on the HOID
task without any additional detection modules. We introduce an HOI reasoning
process and HOID reward functions to solve the HOID task purely with text. The
results on the HICO-DET dataset show that HOI-R1 achieves twice the accuracy of
the baseline with strong generalization ability. The source code is available at
https://github.com/cjw2021/HOI-R1.
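To make the text-only formulation concrete, here is a small Python sketch of
what an HOID reward function could look like in this setting: HOI triplets are
parsed from the model's text output and scored against ground truth. The output
format, the regular expression, and the reward weights are assumptions for
illustration, not the reward functions defined in HOI-R1.

# Hypothetical text-based HOID reward: a format reward for parsable output plus an
# accuracy reward for triplets that match ground truth (verb, object, and box IoU).
import re

TRIPLET = re.compile(r"person \[([\d., ]+)\] (\w+) (\w+) \[([\d., ]+)\]")

def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    def area(r):
        return max(0.0, r[2] - r[0]) * max(0.0, r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def parse(text):
    """Extract (human_box, verb, object, object_box) triplets from model text."""
    out = []
    for h, verb, obj, o in TRIPLET.findall(text):
        boxes = [[float(v) for v in s.split(",")] for s in (h, o)]
        out.append((boxes[0], verb, obj, boxes[1]))
    return out

def hoid_reward(pred_text, gt_triplets, iou_thr=0.5):
    """Weighted sum of a format reward and a triplet-matching accuracy reward."""
    preds = parse(pred_text)
    format_r = 1.0 if preds else 0.0
    hits = 0
    for hb, verb, obj, ob in preds:
        for ghb, gverb, gobj, gob in gt_triplets:
            if (verb == gverb and obj == gobj
                    and iou(hb, ghb) >= iou_thr and iou(ob, gob) >= iou_thr):
                hits += 1
                break
    acc_r = hits / max(len(gt_triplets), 1)
    return 0.2 * format_r + 0.8 * acc_r

if __name__ == "__main__":
    gt = [([10, 20, 50, 80], "ride", "bicycle", [30, 40, 90, 120])]
    pred = "person [12, 22, 48, 78] ride bicycle [28, 42, 92, 118]"
    print(hoid_reward(pred, gt))  # 1.0 for a correct, well-formatted prediction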
Journal Article - rm_published_papers: Scientific Journal
Published 10/2025
European Journal of Investigation in Health, Psychology and Education