Chenchen Ye

@chenchenye_ccye

701
Followers
799
Following
10
Media
23
Statuses

CS PhD student @UCLA | Research Intern @Microsoft | Prev Undergrad @NUSingapore | LLM

Los Angeles, CA
Joined August 2022
Pinned Tweet
@chenchenye_ccye
Chenchen Ye
1 month
📢 New LLM Agents Benchmark! Introducing 🌟MIRAI🌟: A groundbreaking benchmark crafted for evaluating LLM agents in temporal forecasting of international events with tool use and complex reasoning! 📜 Arxiv: 🔗 Project page: 🧵1/N
14
71
299
@chenchenye_ccye
Chenchen Ye
1 month
🧵2/N We released our code, data, and an interactive demo: 💻 GitHub Repo: Dataset: 📊 Interactive Demo Notebook:
1
1
13
@chenchenye_ccye
Chenchen Ye
1 month
🧵11/N Sincere thanks to all amazing collaborators and advisors @acbuller, @Yihe__Deng, @HuangZi71008374, @mingyu_ma, @Zhu_Yanqiao, and @WeiWang1973 for their invaluable advice and efforts! 🙏❤️
0
0
11
@chenchenye_ccye
Chenchen Ye
1 month
🧵 7/N Forecasting with Different Base LLMs 1️⃣ 📈 Code Block benefits stronger LLMs but hurts weaker models. 2️⃣ 🏆 GPT-4o consistently outperforms other models. 3️⃣ 💪 Self-consistency makes a small model stronger.
1
0
10
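Self-consistency, as mentioned in 7/N, generally means sampling several answers from the same model and keeping the most frequent one. A minimal sketch of that voting step, assuming each sampled forecast is a plain string label; the function name `self_consistent_answer` and the example labels are hypothetical, and the thread does not say how MIRAI aggregates samples:

```python
from collections import Counter

def self_consistent_answer(samples: list[str]) -> str:
    """Return the most frequent answer among the sampled forecasts.

    Ties go to the answer that was sampled first.
    """
    if not samples:
        raise ValueError("need at least one sampled answer")
    return Counter(samples).most_common(1)[0][0]

# Hypothetical labels for illustration only.
sampled = ["cooperate", "conflict", "cooperate"]
answer = self_consistent_answer(sampled)
```

Here `answer` is the label that two of the three samples agree on.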
@chenchenye_ccye
Chenchen Ye
1 month
🧵10/N Check our paper out for more details! 🌟 Code error analysis, different event types, variation of API types, and different agent planning strategies! Join us in advancing the capabilities of LLM agents in forecasting and understanding complex international events! 🚀
1
0
10
@chenchenye_ccye
Chenchen Ye
1 month
🧵 4/N Forecasting Task 🔮 Forecasting involves collecting essential historical data and performing temporal reasoning to predict future events. 📅 Example: Forecasting cross-country relations on 2023-11-18 using event and news information up to 2023-11-17.
1
0
10
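The cutoff in the 4/N example (a forecast for 2023-11-18 using information only up to 2023-11-17) amounts to filtering records by date before the agent sees them. A minimal sketch, assuming each record is a dict with a `date` field; the record layout and the name `visible_history` are hypothetical and not taken from the MIRAI code:

```python
from datetime import date

def visible_history(records: list[dict], forecast_date: date) -> list[dict]:
    """Keep only records dated strictly before the forecast date."""
    return [r for r in records if r["date"] < forecast_date]

# Hypothetical records for illustration only.
records = [
    {"date": date(2023, 11, 16), "text": "earlier event"},
    {"date": date(2023, 11, 17), "text": "last visible event"},
    {"date": date(2023, 11, 18), "text": "target-day event"},
]
history = visible_history(records, date(2023, 11, 18))
```

The strict comparison keeps the target day itself out of the agent's view, so nothing from 2023-11-18 can leak into the forecast.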
@chenchenye_ccye
Chenchen Ye
1 month
🧵 8/N Forecasting with Temporal Distance Our ablation study lets agents predict 1, 7, 30, and 90 days ahead. 📊 Results: As the number of days increases, F1 📉 and KL 📈. Agent accuracy drops for distant events. Longer-horizon forecasts must anticipate trend shifts influenced by more factors and complexities.
1
0
10
@chenchenye_ccye
Chenchen Ye
1 month
🧵 6/N Agent Framework 💡 Think: Agent analyzes and plans the next action using API specs. ⚡ Act: Generates a Single Function or Code Block to retrieve data. 🚀 Execute: Python interpreter runs the code for observations. These steps are repeated until reaching a final forecast.
1
0
10
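The Think, Act, Execute loop described in 6/N can be sketched as follows. This is an illustration under stated assumptions, not the MIRAI implementation: the policy is a hypothetical callable that returns Python source, and the variable names `observation` and `final_answer` are conventions invented for this sketch.

```python
from typing import Callable

def run_agent(think_and_act: Callable[[str], str], max_steps: int = 5) -> str:
    """Repeat think -> act -> execute until the generated code sets `final_answer`."""
    history = ""
    for _ in range(max_steps):
        code = think_and_act(history)      # Think + Act: policy returns Python source
        scope: dict = {}
        try:
            exec(code, {}, scope)          # Execute: run the generated code
        except Exception as err:           # errors are fed back as observations
            observation = f"error: {err}"
        else:
            if "final_answer" in scope:    # stop once a forecast is produced
                return str(scope["final_answer"])
            observation = str(scope.get("observation", ""))
        history += f"\n>>> {code}\n{observation}"
    return "no forecast"

def scripted_policy(history: str) -> str:
    """Hypothetical stand-in for an LLM: one retrieval step, then a forecast."""
    if not history:
        return "observation = 2 + 3"
    return "final_answer = 'cooperate'"

forecast = run_agent(scripted_policy)
```

Calling `exec` on model-generated code is unsafe outside a sandbox; it is used here only to keep the sketch self-contained.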
@chenchenye_ccye
Chenchen Ye
1 month
🧵 9/N Tool-Use Ordering in Forecasting 🗂️ Tool-Use Transition Graph: Agents start with recent events for key info and end with news for context. 🧠 Freq.(correct) - Freq.(incorrect): Highlights the need for strategic planning in LLM agents for effective forecasting.
1
0
9
@chenchenye_ccye
Chenchen Ye
1 month
🧵 5/N APIs & Environment 💻 Our comprehensive APIs empower agents to generate code and access the database. 🔧 APIs include data classes and functions for various info types and search conditions. 🔄 Agents can call a single function or generate a code block at each step.
1
0
9
@chenchenye_ccye
Chenchen Ye
1 month
🧵3/N Data With 59,161 unique events and 296,630 unique news articles, we curate a test set of 705 forecasting query-answer pairs. (a) 📊 Circular Chart: The relation hierarchy and distribution in MIRAI. (b-c) 🔥 Heatmap: Intensity of global events, from conflict to mediation.
1
0
9
@chenchenye_ccye
Chenchen Ye
1 month
@nicolayr_ Thanks for sharing your thoughts, Nicolay! Your idea about forecasting from literature sounds really interesting!
1
0
2
@chenchenye_ccye
Chenchen Ye
1 month
@aviaviavi__ Thank you so much, Avi! Looking forward to hearing your thoughts!
0
0
1