Publications
* denotes equal contribution
2025
-
Optimizing Autonomous Driving Datasets for Perception: Complexity, Quality, Uncertainty (under review)
Gongjin Lan*, Zexuan Jia*, and Qi Hao
In IEEE Transactions on Intelligent Transportation Systems, 2025
Autonomous driving significantly benefits from data-driven deep neural networks that require large-scale and high-quality datasets for training. Several well-known datasets such as nuScenes, CADC, and SUSCape have been used to train deep models for autonomous driving. However, these datasets generally contain many redundant and low-quality frames that add training time and degrade the performance of the trained models. In this work, we propose a novel three-layer method that considers scene complexity, instance quality, and sensing uncertainty to remove redundant data and retain valuable data, yielding autonomous driving datasets with smaller sizes and equivalent effectiveness. We apply the method to four well-known datasets (nuScenes, CADC, SUSCape, and Carla-4Scenes), which are optimized to 76.5%, 71.4%, 76.5%, and 81.2% of their original size, respectively. Furthermore, we retrain well-known perception algorithms on these optimized datasets. The experimental results show that perception algorithms retrained on the optimized datasets achieve equivalent or even better accuracy with less training time. We conclude that our three-layer method successfully optimizes autonomous driving datasets to smaller sizes with equivalent effectiveness, which contributes to faster training with lower hardware requirements.
@article{lan2024optimizing,
  title = {Optimizing Autonomous Driving Datasets for Perception: Complexity, Quality, Uncertainty (under review)},
  author = {Lan*, Gongjin and Jia*, Zexuan and Hao, Qi},
  journal = {IEEE Transactions on Intelligent Transportation Systems},
  year = {2025},
}
-
LLM-based and Game-Theoretic Traffic Trajectory Generation for Autonomous Driving
Zexuan Jia and Huayu Xu
In IEEE International Conference on Computer Communication and Artificial Intelligence, 2025
Autonomous driving significantly benefits from data-driven deep neural networks that require large-scale and high-quality datasets for training. To achieve higher performance or conduct critical testing, trajectories with game-theoretic properties are necessary. However, defining such scenarios and collecting large-scale data from reality or simulation is hard and costly. In this work, we propose a novel two-step method, based on a large language model (LLM), that generates traffic trajectories with game-theoretic properties. We apply the method to generate a challenging traffic trajectory dataset, which is validated on several downstream tasks, including perception and prediction. The experimental results show that the generated trajectories support high-difficulty testing and advanced training. We conclude that our method successfully generates challenging traffic trajectories, which helps in collecting such critical data.
@inproceedings{jia2025llmbased,
  title = {LLM-based and Game-Theoretic Traffic Trajectory Generation for Autonomous Driving},
  author = {Jia, Zexuan and Xu, Huayu},
  booktitle = {IEEE International Conference on Computer Communication and Artificial Intelligence},
  year = {2025},
}
2023
-
Active Data Acquisition in Autonomous Driving Simulation
Jianyu Lai, Zexuan Jia, and Boao Li
In Course Work, 2023
Autonomous driving algorithms rely heavily on learning-based models, which require large datasets for training. However, these datasets often contain a large amount of redundant information, and collecting and processing them can be time-consuming and expensive. To address this issue, this paper proposes an active data-collection strategy. For high-quality data, increasing the collection density can improve the overall quality of the dataset, ultimately achieving similar or even better results than the original dataset with lower labeling costs and smaller dataset sizes. In this paper, we design experiments to verify the quality of the collected dataset and to demonstrate that this strategy can significantly reduce labeling costs and dataset size while improving the overall quality of the dataset, leading to better performance of autonomous driving systems.
@inproceedings{lai2023active,
  title = {Active Data Acquisition in Autonomous Driving Simulation},
  author = {Lai, Jianyu and Jia, Zexuan and Li, Boao},
  booktitle = {Course Work},
  year = {2023},
}